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FOREWORD 


This  report  provides  some  new  insights  into  and  approaches  toward  image 
modeling  as  applied  to  target  Identification,  whether  that  target  be  a  ship,  an 
incoming  missile,  or  an  aircraft.  The  approach  taken  is  that  of  examining  the 
energy  in  prescribed  wave-bands  which  emanates  from  a  target  and  correlating  the 
emissions.  Typically  one  might  be  looking  at  two  or  three  infrared  bands, 
together  possibly  with  several  visual  bands  as  well.  The  target  is  segmented, 
using  both  first  and  second  order  modeling,  into  a  set  of  "interesting 
components,"  and  these  components  are  correlated  in  such  manner  as  to  enhance 
the  classification  process. 

A  Markov-type  model  is  used  to  provide  an  a. priori  assessment  of  the 
spatial  relationships  among  critical  parts  of  the  target,  and  a  stochastic  model 
using  the  output  of  an  initial  probabilistic  labeling  is  invoked.  The  tradeoff 
between  this  stochastic  model  and  the  Markov  model  is  then  optimized  so  as  to 
yield  a  best  labeling  for  identification  purposes.  In  an  identification  of 
friend  or  foe  (IFF)  context,  the  methodology  presented  in  this  report  could  be 
of  great  interest,  for  it  provides  the  ingredients  for  such  a  higher  level  of 
understanding. 

We  wish  to  thank  several  people  who  have  provided  both  ideas  and 
encouragement  for  this  investigation.  In  particular,  Mr.  Gordon  Powell  and  Mr. 
Kent  Anderson,  formerly  of  NSWC,  provided  interesting  comments.  Mr.  Powell, 
acquainted  the  author  with  much  of  the  literature  in  the  field.  Also,  Mr.  John 
Moscar  and  Mr.  Mike  Rowan  were  very  helpful  in  the  early  stages  of  the  effort, 
since  through  them  the  author  became  familiar  with  previous  applied  work  at 
NSWC.  With  regard  to  infrared  processing,  the  author  has  had  an  interesting 
working  relationship  with  Mr.  Donald  M.  Wilson  and  has  learned  much  about  the 
benefits  of  infrared  sensing  in  target  discrimination  from  him.  Finally,  the 
author  would  like  to  recognize  Or.  John  Wingate,  Dr.  Brooke  Stephens,  and  Dr.  K. 
Y.  Chien,  of  R44,  who  provided  interesting  mathematical  and  physical  insights. 
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CHAPTER  1 
INTRODUCTION 

In  recent  work  on  aircraft  identification  and  ship  classification,  the 

O 

classical  theory  of  invariants  was  employed  in  order  to  classify  a  target 
known  to  belong  to  one  of  a  given  number  of  target  classes.  The  classes  were 
constructed  in  the  following  manner:  A  target  prototype  from  the  class  was 
viewed  from  a  number  of  perspectives  and  under  ideal  imaging  conditions.  Then, 
thinking  of  the  two-dimensional  projections  of  the  target  as  binary  images,  the 
moment  area  method  was  used  to  construct  a  feature  vector  for  the  given 
perspective.  Each  feature,  being  a  proper  combination  of  moments,  had  three 
fundamental  invariance  properties,  which  were  Invariance  to  position  of  the 
image  in  the  field  of  view,  invariance  to  rotation  in  the  image  plane,  and 
invariance  to  range  of  the  target.  Of  course,  a  real  target  is  a  superposition 
of  an  ideal  target  and  background  noise  due  to  environmental  effects.  In 
addition,  noise  will  be  Introduced  by  the  Imaging  system  itself.  Therefore, 
before  one  can  successfully  apply  any  pattern  recognition  method,  he  must  be 
able  to  extract  this  noisy  target  from  the  background.  There  are  two  steps  to 
this  process:  (1)  The  picture  is  subjected  to  some  kind  of  smoothing 
operation,  so  that,  hopefully,  the  noise  is  reduced  to  a  manageable  level. 

(2)  Now  that  one  has  a  sensible  picture,  one  extracts  the  target  by  boundary 
identification,  which  often  requires  some  kind  of  gradient  algorithm.  Once  the 
object  has  been  extracted,  it  is  compared  to  the  data  base  of  ideal  target 
prototypes,  and  the  object  is  assigned  to  that  class  which  is  closest,  in  some 
well-defined  sense,  to  the  image  representation. 

We  would  like  to  emphasize  at  this  point  that  previous  work  has  often  been 
centered  around  a  rather  naive  interpretation  of  the  object  of  interest.  Each 
image,  be  it  optical,  infrared,  or  otherwise,  is  conceived  in  terms  of  two  color 
levels,  black  and  white,  with  black  being  assigned  a  1  and  white  a  zero  (or  vice 
versa).  Such  a  two-level  description  is  deemed  appropriate  from  the  point  of 
view  of  computational  feasibility,  provided  that  all  that  is  really  necessary 
anyway  is  a  crude  target  assignment.  However,  in  practice,  it  Is  easy  to 
imagine  a  situation  in  which  a  higher  level  description  is  not  only  desirable, 
but  indeed  necessary.  For  example,  it  Is  one  thing  to  determine  that  a  given 
ship  is  a  destroyer,  rather  than  an  aircraft  carrier  or  a  tanker,  but  it  may  be 
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quite  another  to  ascertain  whether  or  not  it  is  an  enemy  or  a  friendly 
destroyer.  (Imagine  a  combat  situation  in  which  such  discrimination  is 
crucial .) 

The  use  of  relaxation  methods  for  target  discrimination  seems  to  be  ideally 
suited  to  the  task  at  hand.  In  particular,  the  work  of  Rosenfeld,  Hummel  and 
Zucker  ^  appears  to  be  of  fundamental  interest.  One  sees  immediately  that  such 
methods  should  engender  a  higher-level  definition  of  the  object  if  one  can 
appropriately  define  the  components  of  interest,  together  with  a  meaningful 
assignment  of  their  interrelationships.  However,  what  one  does  not  usually  see 
in  the  literature  is  a  marriage  of  the  concepts  of  relaxation  and  multispectral 
classification.  This  would  seem  to  be  a  natural  extension,  and  one  of  the 
purposes  of  this  document  will  be  to  investigate  the  potential  of  such  a 
union.  In  addition,  we  shall  pursue  the  theory  of  contour  or  boundary 
extraction  in  the  context  of  multispectral  information.  We  note  that  some  work 
in  this  direction  has  been  pursued  recently  by  Eklundh,  Yamamoto,  and 
Rosenfeld,5  and  that  the  results  have  been  quite  encouraging.  Also,  we  shall 
invoke  the  work  of  Haralick,5  Nahi,^  Kaufman,**  and  Woods  9  and  attempt  to  unify 
their  concepts  in  a  meaningful  way. 

Our  analysis  will  begin  with  a  procedure  similar  to  that  of  Chen  and 
Pavlidis.10  That  is  to  say,  we  shall  show  how  to  devise  a  split-and-merge 
algorithm  for  our  purpose.  Implicit  in  this  methodology  is  the  incorporation  of 
the  idea  of  the  quad  tree.** 

It  is  hoped  that  this  work  will  prove  useful  in  more  carefully  defining  the 
structure  of  targets. 
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CHAPTER  2 

THE  GENERALIZED  SPLIT-AND-MERGE  ALGORITHM 

We  shall  describe  here  in  general  terms  what  we  propose  to  do  with  regard 
to  a  unification  of  the  work  of  Haralick  6  and  Chen  and  Pavlidis.1®  Haralick's 
sloped  facet  model  is  to  be  employed  in  conjunocion  with  the  splitting  phase  of 
the  Chen  and  Pavlidis  procedure.  That  is  one  new  idea.  Secondly,  we  shall 
superimpose  a  color  (or  pseudocolor)  scheme  for  two  reasons:  (1)  It  is  well- 
known  that,  by  so  doing,  one  may  make  the  target  intensity  profile  insensitive 
to  changes  in  illumination.^  This  is  a  distinct  advantage  from  a  physical 
point  of  view.  (2)  In  infrared  detection,  where  our  methods  may  prove  most 
valuable,  it  is  important  to  examine  gray-level  intensity  in  different  wave 
bands  and  to  correlate  the  results  so  obtained.  Such  methodology  has  proved  to 
be  of  great  value  in  LANDSAT  image  studies. 

Now  let  us  begin  with  the  idea  of  the  quad  tree.  Suppose  that  we  have  a 
rectangular  target  area,  as  shown  in  Figure  2-1.  We  shall  divide  this  rectangle 
into  four  congruent  subrectangles  as  shown.  Now  let  us  assume  that  our  target 
pixels  are  described  in  terms  of  normalized  primary  colors  (or  their 
equivalents).  That  is,  we  have  the  nonnegative  vector  (g^,  gR,  gg)t,  with 
9r  +  +  9r;  =  1.  Then  let  us  compute  the  sample  covariance  matrix  13  for  the 

entire  rectangle  and  compare  it  with  those  for  the  subrectangles.  Suppose  that 
K  is  the  overall  covariance  matrix  and  that  ,  1  <  i  <  4,  is  the  matrix  for 

subrectangle  i,  as  indicated  in  the  figure.  The  game  will  now  be 
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FIGURE  2-1.  THE  QUAD-TREE  SPLIT -AND -MERGE  ALGORITHM 


played  as  follows:  If  the  maximum  norm  of  the  difference  Detween  K  and  , 

i.e.,  max  ( | K  -  K .  1 1 ,  does  not  exceed  some  small  positive  ,  then  we 
1  <  i  <  4 

proceed  with  Haralick's  method  for  determining  whether  or  not  all  the 
subrectangles  belong  to  the  same  sloped  surface  (facet).  On  the  other  hand, 
when  the  max  of  the  norms  does  not  pass  the  epsilon  test,  we  do  not  use 
Haralick's  procedure.  In  that  case,  we  subdivide  rectangles  1,  2,  3,  and  4  into 
subrectangles.  Eventually  we  shall  arrive  at  a  dissection  for  which  the  equal 
covariance  assumption  is  valid.  From  that  point  onward  the  scheme  is 
applicable.  Boundary  determination  should  fall  out  of  this  process  naturally  in 
terns  of  small  boxes  containing  the  boundary  points,  as  illustrated  in  Figure 
?-l. 

What  we  really  have  in  mind  is  the  determination  of  a  number  of  pertinent 
components  of  a  target  which  are  useable  in  later  identifying  it.  The  idea  is 
that,  once  we  know  the  boundaries,  we  automatically  have  the  regions,  which  are 
formed  by  merging  of  subregions  having  the  same  parameters  associated  with 
them.  This  is  the  "merge"  part  of  the  algorithm.  The  next  chapter  is  devoted 
to  the  modification  of  Haralick's  scheme. 
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CHAPTER  3 

THE  GENERALIZED  SLOPED  FACET  MODEL 

In  what  follows  we  shall  make  use  of  the  quad-tree  concept,**  whereby  a 
rectangle  is  divided  into  four  equal  parts  and  each  part  is  again  divided  into 
four  parts  and  so  on,  until  a  certain  group  of  criteria  are  met.  As  shown  in 
Figure  3-1,  let  us  assume  that  we  have  a  rectangular  lattice  of  points  and  that 
the  equal  covariance  assumption  of  Chapter  2  Is  valid.  Point  0  is  the  center  of 
our  large  rec tangle,  and  point  Qj  is  the  center  of  subrectangle  i.  Note  that 
all  four  subrectangles  meet  at  0.  We  want  to  test  the  hypothesis  that  all 
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FIGURE  3-1.  TESTING  THE  SLOPED  FACET  MODEL  ASSUMPTIONS 
subrectangles  are  associated  with  the  same  sloped  planes.  We  are  assuming, 
therefore,  a  regression  model  of  the  following  type: 

gR(r,c)  =  aRr  +  eRc  +  yR  +  nR(r,c) 

9B(r,c)  •  cBr  +  sBc  *  +  nglf.c)  (3-1) 

gG(r,c)  *  afir  +  eQc  +  yQ  +  n6(r,c), 

where  9r  +  93  +  9s  a  1,  r  is  the  row  index,  and  c  is  the  column  index.  Of 

course,  the  constraint  that  all  three  color  components  sum  to  one  allows  us  to 
incorporate  into  the  state  vector  any  two  of  them.  We  shall  assume  that. 
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although  the  red,  blue  and  green  components  may  be  statistically  correlated  at  a 
given  point  in  the  image,  they  are  independent  from  point  to  point.  Note  that 
this  is  a  first-order  assumption,  which  is  given  for  computational  convenience; 
in  fact,  a  higher  order  model  will  be  given  later  in  this  report.  Selecting  the 
red  and  blue  quantities  in  (3-1),  we  present  the  matrix  formulation 


In  the  usual  way,  assuming  that  is  the  2  by  2  covariance  matrix  for  the  noise 
term  in  (3-2),  we  may  formulate  the  least  squares  problem  of  minimizing 


l  l 

reR  CeC 


:r,  c* 


(3-3) 


where  t  denotes  transpose  of  the  error  vector^  c,  R  is  the  row  index  set  for 
the  rectangle,  and  C  is  the  column  index  set.  Furthermore, 


\9b/  \  b  6b  ^b/^1' 


(3-4) 


where  the  matrix  in  (3-4)  is  the  array  of  expected  coefficients.  Note  now  the 
fact  that  is  symmetric  and  positive  definite,  so  that  it  may  be 
diagonalized  by  an  orthogonal  matrix  P: 

0  =  P  K'1  Pl,  (3-b) 

n  n 

where  D  =  diag  (a,,  \0)  and  A,,  A0  are  positive.  We  are  here  assuming,  of 
^  ^  *■  ^  14  -1 

course,  that  the  process  is  erqodic  so  that  is  estimable  by  means  of 
the  spatial  information  given  in  the  gray  level  profiles. 

Substituting  (3-5)  into  (3-3),  one  has  to  minimize 


I  I  £ 

reR  ceC 


t 


r,c 


D 

n 


where  f„  =  P  e„  .  Thus 
-r,c  -r,c 


(3-6) 
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\%V  \“B '  sb'  V/ W 

It  follows  that  (3-6)  may  be  rewritten  as 


(3-7) 


er2  =  *i  I  I  [aR'r  +  gR'c  +  yr'  -  gR*  (r,c)]2 
r  c 


(3-8) 


+  *2  £  I  [®b ' r  +  eB'C  +  yb'  ‘  9b'  (r»c)^2* 

r  c 

A  A  A 

Differentiation  of  (3-8)  with  respect  to  aR',  br',  and  yR',  imposition  of  the 

conditions  \  r  -  \  c  =  0  (corresponding  to  appropriate  choice  of  origin  0  as 
reR  CeC 

indicated  in  Figure  3-1),  and  setting  the  results  equal  to  zero  lead  to 

A  A  A  A  A  A 

expressions  for  cxR‘,  gR‘,  yR‘,  aB‘,  gg',  and  aB‘  in  terms  of  their  true  values 

and  additive  noise  terms.  The  optimal  values  for  aR‘,  gR ' ,  and  yR'  are  then 


V  =  “r'  +  I  I  rnR‘  l  r 

r  c  re 

Vs  Br*  +  l  l  cnR-  (r,c)/  l  l  c2 
r  c  re 


Tr '  =  V  +  I  I  nR'  ( r,c)/l  l  1, 
r  c  r  c 


(3-9) 


where  aR‘,  gR',  yr'  ace  the  true  values  of  the  parameters.  If  one  replaces  the 
subscript  R  by  B,  one  obtains  the  analogous  relations  for  the  blue  parameters. 
One  sees  from  (3-9)  and  the  corresponding  blue  relations  that  the  estimates  are 
unbiased.  Furthermore,  using  the  covariance  matrix  for  the  noise,  namely, 

D  one  may  develop  the  covariance  matrix  for  the  estimates.  This  is  seen  to 
be  a  six  by  six  diagonal  matrix  with 
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''(«„•)  ‘  «i2/  I  l  r2 
r  c 

v(6R,}=a12/2  l°2  (3-10) 

r  c 

V(V>  ’  “1  l'l  l  !. 

r  c 

/-  o 

just  as  in  Haralick's  paper,  where  =  1/x ^ .  Similarly,  for  the  blue 
parameters  one  gets  expressions  analogous  to  (3-10)  with  -  l/x^ 
replacing  <j^‘  The  covariance  terms  are  zero  as  before. 

Now  consider  any  two  subrectangles,  as,  for  example,  those  with  local 
origins  0^  and  O4  in  Figure  3-1.  Note  that  these  subrectangles  meet  at  0,  which 
is  also  the  midpoint  of  the  line  segment  joining  Oj  to  O4.  Assuming  a  linear 
model  representation  over  both  subrectangles,  we  would  like  to  test  the 
hypothesis  that  both  linear  facets  (for  red  and  blue  components  individually) 
are  really  part  of  the  same  facet.  For  this  to  happen,  clearly  a^R'  =  c^', 

31R*  =  e4R*’  “lB*  =  01  4b  '  ’  and  6ib  '  =  64b'*  Now  SuPP0Se  that  re9ions  1  and  4  ar® 
equally  sized,  i.e.,  have  the  same  number  of  grid  points  similarly  placed,  as 

indicated  in  Figure  3-1.  Furthermore,  suppose  that  0  is  not  a  grid  point  of 

either  region.  For  purposes  of  calculation,  then,  the  two  subrectangles  are 

mutually  exclusive,  and  we  find  that  both  N^R  =  (ajR'  -  <*4R')(  l  l  rV2)*^ 

and  N2R  =  (31r'-  $4R  1 ) (  l  l  c 2/2)^2  are  normal  random  variables  &ith  mean  0 
9  re 

and  variance  a.  .  Similarly,  one  observes  that  their  blue  counterparts  have 
*  2 

mean  0  and  variance  a2  .  Now,  as  in  Haralick's  paper,  we  observe  that,  in  order 
for  the  sloped  surfaces  to  coincide,  it  follows  that  their  heights  must  coincide 
at  point  0.  The  coordinates  of  U  relative  to  0^  are  seen  to  be  (ar/2,  ac/2), 
and  those  of  0  relative  to  O4  are  (-Ar/2,  -ac/2),  where  (Ar,  ac)  represents  the 
position  of  O4  relative  to  0j.  For  the  red  component,  the  true  height  of  the 
surfaces  at  point  0  is  given  by  alR'Ar/2  +  b1r'ac/2  +  y1r'  and  o4R'(-Ar/2) 

+  b4r 1 (-ac/2)  +  Y4R',  respectively.  Therefore,  under  the  same  sloped  surface 
hypothesis,  we  must  have 

(■V  ♦  °4R>r'2  *  <V  +  e4R  ■  )ac/2  ♦  Y1R'  -  V  *  0  (3-11) 
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Once  (3-11)  is  true,  it  is  seen  that 


N3R  =  (“lR*  +  a4R ’ ^Ar/2  +  1R '  +  64r')4c/2  +  yJR  -  Y4R‘ 


(3-12) 


is  normal  with  mean  0  and  variance 


cRZ  -  20l2[(ar/2)2/  l  l  r2+(ac/2)2/  £  £  c2+l/  J  J  1]. 

re  r  c  r  c 

We  note  also  that 


(3-13) 


e[(o1R'  +  «4R')(oiR'  -  a4R  '  ^  ^  "  El*elR  '  +  64r’^61r’  "  e4R  '  ^  =  °* 


Letting 


eiR2=  1  I  CaiR'r  +  BiR'c  ♦  y1r'  -  9R'(r,c)]2,  1-1.4, 
r  c 


2  2 

and  using  the  results  of  Haralick's  paper,  we  note  that  e1R  /oj  and 
e4R2/°l2  are  independent  chi-squared  random  variables,  each  with 

£  £  1  -  3  degrees  of  freedom.  Note  also  that  it  makes  sense  for  us  to  consider 
r  c 

the  red  and  blue  facets  individually  in  the  transformed  (prime)  variables,  since 

A  A  A  A  A  A 

the  estimators  aR',  BR‘,  yR>  <*g'»  Bg*,  and  yg '  are  statistically  independent. 

Therefore,  one  is  led  to  consider  the  F  statistic 


(3-14) 


<y^  h?r2  * 

[(elRZ  ♦  ewZ )/(2  !  i  1  -  « 
r  c 


(3-15) 


and  a  similar  statistic  Fg  for  the  blue  component.  If  FR  and  Fg  are  both 
sufficiently  small,  we  agree  to  accept  the  hypothesis  that  rectangles  1  and  4 
are  part  of  the  same  component. 

Now  let  us  adopt  a  general  point  of  view.  Suppose,  from  practical 
considerations,  that  we  decide  that  the  "hottest  part  of  a  ship"  (which  might  be 
its  engine  room)  is  an  item  of  critical  interest  as  far  as  identification  of  the 
ship  is  concerned.  Let  us  also  agree  that  hot  corresponds  to  dominance  in 
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amplitude  of  the  blue  component  over  the  red  component.  (In  infrared 
terminology,  one  might  be  looking  at  radiance  in  one  wave  band  as  opposed  to 
another.)  We  agree  to  examine  closely  those  pixels  for  which  gg(r,c)  >  T,  where 
T  is  a  high  threshold.  Referring  to  Figure  3-1,  let  us  examine  rectangles  1,  2, 
3,  and  4  in  turn.  Now  simply  count  the  number  of  pixels  n^  for  which  gg  >  T, 
where  nj  is  the  number  of  such  elements  in  box  i.  Suppose  it  turns  out  that 
only  box  1  contains  pixels  for  which  the  threshold  criterion  is  valid.  Then,  if 
the  equal  covariance  matrix  assumption  is  valid,  we  apply  the  analysis  of  the 
preceding  paragraph  to  determine  whether,  in  turn,  rectangles  1  and  4,  1  and  2, 
and  1  and  3  can  be  considered  as  part  of  the  same  sloped  surface.  We  store  the 
parameters  for  each  rectangle  in  the  process,  so  that  we  can  recover  them  for 
later  use.  If  rectangles  1  and  2  are  part  of  the  same  surface,  but  that  surface 
is  not  conti nuable  into  boxes  3  and  4,  then  we  have  located  our  region  of 
interest,  namely,  the  union  of  rectangles  1  and  2,  whose  boundary  is  just  the 
horizontal  line  bisecting  the  big  rectangle.  When  analyzing  regions  1  and  2  to 
determine  whether  the  same  facet  applies,  we  use  point  P  as  that  point  for  which 
the  two  facets  should  agree.  Similarly,  for  boxes  1  and  3,  we  use  point  Q.  Now 
suppose  that  the  sloped  facet  for  rectangle  1  is  not  extendible  into  any  ov  the 
remaining  boxes.  If  that  is  the  case,  we  subdivide  rectangle  1  into  four  boxes 
and  apply  our  threshold  logic  again  to  determine  those  subrectangles  containing 
pixels  for  which  gR  >  T.  Let  us  record  these  rectangle  numbers.  Also,  observe 
that  subrectangle  of  rectangle  1  containing  the  largest  number  of  pixels  for 
which  gR  >  T.  Test  that  subrectangle  against  the  other  subrectangles  of  box  1 
to  see  whether  its  sloped  facet  is  extendible.  If  so,  we  mark  off  the 
appropriate  subrectangles  and  the  parameters  associated  with  them  and  proceed  to 
examine  the  remaining  subrectangles  containing  pixels  for  which  gR  >  T. 
Otherwise,  let  us  further  subdivide  the  given  subrectangle  and  reapply  our 
procedure.  Eventually,  one  of  two  situations  will  occur:  (a)  We  find  a  subbox 
completely  contained  within  the  component  of  interest,  together  with  other 
subboxes  containing  part  of  the  component,  (b)  We  arrive  at  a  box  containing 
too  few  pixels  to  apply  the  sloped  facet  mode.  In  case  (a),  we  simply  record 
the  subbox  of  interest,  together  with  its  a,  0,  and  y  parameters  and  proceed  to 
backtrack  one  step  up  the  quad-tree,  looking  at  the  remaining  boxes  containing 
the  pixels  of  interest.  We  use  the  maximum  number  criterion  again  when  we 
dissect  the  remaining  boxes.  Case  (b)  can  arise  in  two  ci rcumstances:  The 


3-6 


NSWC  TR  84-54 


first  occurs  when  every  box  which  could  be  within  the  component  of  interest 

contains  too  few  pixels.  This  is  a  signal  that  either  our  pixel  structure  is 

too  course  to  resolve  what  is  happening  or  our  threshold  criterion  is  too 
stringent  or  both.  The  second  circumstance  is  that  in  which  we  are  indeed  in 

the  vicinity  of  a  boundary  point.  In  that  case,  we  may  mark  the  box  as  a 

"boundary  box."  We  then  back  up  one  step  in  the  quad-tree  and  continue  the 
process  to  find  the  remaining  boundary  points.  The  entire  scheme  should  yield 
both  the  red  and  blue  parameters  for  the  region  of  interest,  together  with  the 
boundary  boxes.  A  polygonal  line  can  then  be  drawn  through  such  boxes,  and  the 
resulting  line  can  be  accepted  as  a  first-order  approximation  to  the  boundary. 
The  logic  involved  in  this  process  is  flow-charted  in  Figure  3-2. 


RECORD 
APPROPRI¬ 
ATE  BOXES 


ELIMINATE 
SUCH  BOXES 
FROM  SAMPLE 


/  EQUAL  \ 
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FINO  BOX 
WITH  njD 
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INPUT 


RECORD 
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FIGURE  3-2.  FLOW  CHART  FOR  QUAO-TREE  ASSIGNMENT  OF  COMPONENT 
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Now  suppose  that  we  have  prioritized  the  subcomponents  of  the  target  of 
interest  to  us.  Let  us  agree  that  the  blue  component  intensity  is  of 
interest.  Scanning  the  target,  one  finds  that  the  largest  value  of  gg(r,c) 
is  Tj.  One  then  looks  for  the  component  (or  components)  containing  such 
pixels.  Physically,  such  components  might  be  the  hottest  parts  of  the 

target.  Having  found  those  portions  of  the  image,  we  eliminate  them  from 
consideration  and  focus  our  attention  on  the  remaining  pixels.  Continuing  with 
the  same  line  of  reasoning,  let  us  now  find  those  pixels  in  the  rest  of  the 
figure  for  which  gB(r,c)  is  again  maximum,  i.e.,  gB(r,c)  =  T2.  Pursuing  again 
our  quad-tree  analysis,  we  find  those  subcomponents  containing  the  pixels  for 
which  gB(r,c)  =  T2.  Continuing  this  process  would  obviously  lead  to  a  first- 
order  dissection  of  our  rectangle  into  all  its  primitive  subregions.  Of  course, 
since  gB  is  a  noisy  field,  it  is  not  necessarily  the  case  that  the  components 
containing  those  pixels  for  which  gB  is  maximum  are  indeed  those  corresponding 
to  the  hottest  portions  of  the  target.  That  is,  the  smoothed  versions  might 
actually  correspond  to  other  subregions.  If  that  is  the  case,  so  be  it.  All 
that  one  can  say  is  that  the  noisy  version  gives  us  a  priori  information  and 
that  we  shall  hopefully  learn  through  a  posteriori  analysis  the  underlying 
nature  of  the  components  of  interest.  If  all  goes  well,  there  will  be  a  minor 
reshuffling.  Notice  that  there  is  a  potential  saving  in  this  approach  in  that 
perhaps  only  a  certain  number  of  subcomponents  of  a  target  may  be  critical  to 
its  classification.  Also,  when  all  is  done,  (3-8)  should  serve  as  a  weighted 
measure  of  just  how  well  a  sloped  facet  model  of  this  type  represents  the  data. 

THE  SOBOLEV  MODEL 

Before  leaving  this  topic,  we  shall  introduce  a  new  concept  which  might  at 
times  prove  useful  in  generating  a  sloped  facet  target  description.  The  idea 
will  be  to  use  the  Sobolev  norm  as  either  a  validation  tool  or  possibly  as  a 
means  to  obtain  a  better  sloped  facet  model.  We  must,  of  course,  explain  what 
such  a  norm  is  and  how  it  might  be  helpful.  First  of  all,  let  us  consider  the 
1-dimensional  case.  Then  the  norm  is  given  by 

ll'lls2  =«llfl|2L2+  ('-«>  llf'H2LZ  <3-16> 

for  any  continuously  differentiable  function  f,  where 
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I  lf  I  I  2  2  =  /b  f 2W  dx» 

L  a 

0  <  o  <  1,  and  the  domain  of  definition  of  f(x)  is  the  closed  interval  [a,b]. 

Now  a  discrete  version  of  (3-16)  may  be  given  in  which  one  replaces  th  ‘  notion 
of  derivative  by  that  of  divided  difference  and  in  which  integration  is  replaced 
by  summation. 

Refer  then  to  Figure  3-3.  Here  the  function  f(x)  is  defined  only  at  the 


FIGURE  3-3.  POLYGONAL  LINE  APPROXIMATION  TO  f(x)  ON  ORIGINAL  GRID 
control  values  x^ ,  0  <  i  <  n,  and  the  polygonal  line  indicated  simply  connects 
all  of  the  points  (x1-,f(x^))  in  some  systematic  fashion.  As  drawn,  the  average 
slope  of  the  individual  line  segments  might  reasonably  approximate  that  of  the 
line  PQ.  However,  the  variance  of  these  slopes  from  the  average  may  be 
substantial.  If,  on  the  other  hand,  we  double  the  mesh  size,  so  that  we  sample 
only  the  values  of  f  at  xg,  X2»  X4,...,  we  see  that,  roughly  speaking,  as 
indicated  by  the  dashed  lines,  the  new  set  of  slopes,  half  of  the  previous 
number,  still  may  have  an  average  value  close  to  the  true  slope,  but  now  the 
variance  of  the  slopes  seems  substantially  smaller.  Therefore,  a  rather 
interesting  validation  procedure  suggests  itself.  What  happens  if  one  decides 
to  use  only  the  values  of  f ( x )  on  the  coarse  mesh,  together  with  the  slopes 
(divided  differences)  as  approximations  to  the  derivatives,  the  slope 
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Informatic  eplacing  half  the  function  information,  and  formulate  a  least 
squares  problem  using  a  Sobolev  norm  adapted  to  the  mesh?  How  would  the 
regression  line  obtained  compare  with  that  procured  by  using  all  of  the  function 
information  on  the  original  mesh? 

The  above  is  a  one-dimensional  version  of  the  process  in  which  we  are 
really  interested,  the  latter  being  two-dimensional.  In  our  situation  we  have  a 
two-dimensional  grid  of  points.  Suppose  that  one  finds,  by  conventional  means, 
that  a  sloped  facet  reasonably  represents  the  noisy  data.  Could  one  have  done 
better  by  doubling  or  perhaps  tripling  the  mesh  size,  eliminating  half  or  more 
of  the  original  grid  points,  and  substituting  for  the  red  and  blue  intensity 
values  divided  difference  approximations  to  their  first  partial  derivatives? 

The  new  least  squares  problem  would  then  involve  a  finite  Sobolev  norm.  The 
question  to  be  asked  is:  Would  the  new  technique  yield  an  improvement  on  the 
least  squares  fit?  If  so,  how  much  of  an  improvement  would  there  be? 

The  answers  in  one  dimension  were  worked  out  by  the  author  ten  years 
ago.16  There  it  was  shown  that  one  could  derive  estimators  that  were 
theoretical ly  better  than  the  classical  one.  It  was  also  shown  in  another 
report,16  in  which  a  shock  tube  experiment  was  involved  for  fitting  time  of 
arrival  and  shock  peak  pressure  data,  that  such  techniques  could  be  used  for 
validation  purposes,  in  that  case  for  determining  whether  the  fit  to  pressure 
obtainable  through  differentiation  of  a  linear  fit  to  time  of  arrival  data  was 
reasonable.  We  shall  here  adapt  the  work  in  the  former  report  15  to  our  2-D 
problem.  Let  us  assume  that,  at  each  pixel  (r,c),  we  have  three  measured 
quantities,  written  in  vector  fashion  as  (x(r,c),  xr(r,c),  xc(r,c))t,  where  xr 
signifies  the  partial  derivative  of  gray  level  in  the  r  direction  and  xc 
indicates  the  partial  derivative  in  the  column  direction.  Of  course,  e(r,c) 
itself  is  to  be  the  gray  level  at  pixel  (point)  (r,c).  In  line  with  our 
previous  development,  x  would  generally  be  either  a  red  or  blue  intensity.  One 
would  use  a  divided  difference  at  (r,c)  in  the  r  direction  to  ascertain  xr  and, 
similarly,  a  divided  difference  in  the  c  direction  to  obtain  xc. 

There  are  three  types  of  divided  differences  which  could  be  used,  namely, 
forward  differences,  backward  differences,  and  central  differences.  Inasmuch  as 
there  are  advantages  and  disadvantages  to  the  use  of  the  different  types,  we 
shall  present  one  model  based  on  the  use  of  forward  differences  and  another 
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predicated  on  utilization  of  central  differences.  The  latter  would  be  more 
accurate  and  more  straightforward  except  for  the  fact,  as  we  shall  see,  that  a 
coarser  grid  is  demanded.  We  shall  also  see  that  forward  differences  have  the 
disadvantage  of  being  correlated  with  function  values,  whereas  central 
differences  are  not,  provided  the  points  used  to  calculate  them  are  adequately 
separated. 

Suppose  that  the  pixels  are  so  separated  that  x(r,c)  is  independent  of 
x(rlfcj)  for  (r,c)  *  (r^cj).  Furthermore,  assume  that 

x(r,c)  =  or  +  8c  +  y  +  n^r.c),  (3-17) 


where  estimates  of  o,  8,  and  y  are  to  be  obtained  and  n^(r,c)  is  a  zero-mean 

p 

Gaussian  random  variable  with  variance  a  .  We  hypothesize  that  n^(r,c)  has 

O 

variance  a  regardless  of  the  particular  point  (r,c)  involved.  Let  us  treat 
first  the  forward  divided  differences,  and  suppose  that  we  want  to  form  xr(r,c), 
an  estimate  of  the  partial  derivative  in  the  r  direction.  We  do  this  with  some 
care  so  as  to  insure  that  xp(r,c)  is  independent  of  x(rj,Ci),  xr(r1,ci),  and 
xc(ri,Cl)  whenever  (r,c)  *  (rj.Cj).  Let  us  suppose  that  x(r,c)  and 
x(r  +  2Ar,c)  are  two  intensity  values  at  adjacent  grid  points  in  a  column. 

Assume  that  the  value  x(r  +  Ar,c)  is  available  and  is  statistically  independent 
of  x(r,c).  Then  we  shall  form  the  divided  difference  of  x(r,c)  and  x(r  +  Ar,c) 
and  use  that  as  the  estimate  for  xr(r,c).  Now  we  find  that 


„  i _  _  x(r  +  Ar,c)  -  x(r,c)  _  nl^r  +  Ar,c^  "  nl(r»c) 

Vr’c;  "  Ar  "  a  Ar 


(3-18) 


Two  facts  emerge  immediately  from  (3-18).  First,  xp(r,c)  is  an  unbiased 
estimator  of  a.  Secondly,  the  noise  term  for  xr(r,c)  has  been  derived  using  (3- 
17),  but  is  still  Gaussian  with  mean  0.  In  addition  it  has  variance 

p  2 

2a  /(a r)  .  Another  observation  that  we  can  make  is  that  xr(r,c)  will  not  be 
correlated  with  the  vector  (x(rj,cj),  xr(rj,Cj),  xc(rj,cj)),  since  the  latter's 
noise  terms  do  not  appear  in  (3-18).  Using  (3-17)  and  (3-18),  one  sees  that 
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n, (r,c)  (n, (r  +  Ar,c)  -  n, (r,c)) 
cov  (x(r,c),  xr(r,c))  =  E  [— - — - ] 


E  (n^Cr.c)) 


A  r 


(3-19) 


-  -a  /at. 


Furthermore,  the  correlation  coefficient,  defined,  as  usual,  as 


P ( x , Xf )  =  cov  (x,xp)/oxax  , 

r 


becomes 


p(x,xr)  =  -1//2. 


(3-20) 


Similarly 


*c(r,c)  =  8  + 


ni  (r,c  +  Ac)  -  n-i  (r,c) 


AC 


(3-21) 


representing  the  partial  derivative  in  the  column  direction.  We  assume,  as 

before,  that  adjacent  points  in  a  row  are  representable  by  x(r,c)  and 

x(r,c  +  2ac).  One  finds  that  xc(r,c)  is  an  unbiased  estimator  of  8  and  that  the 

2  2 

variance  of  xc(r,c)  is  2a  /(ac)  .  Analogous  to  (3-19)  one  has 


cov  (x(r,c),  xc(r,c))  =  -  ac/AC. 


(3-22) 


Of  course,  the  correlation  coefficient  will  be  the  same  as  in  (3-20).  Finally, 
we  want  cov  (xr(r,c),  xc(r,c)).  Invoking  (3-18)  and  (3-21),  we  have 


cov  (x  (r,c),  x  (r,c))  =  a  /Ar  ac. 


(3-23) 
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Our  model  equations  may  be  represented  by 


x(r,c)  =  ar  +  gc  +  y  +  n^r.c) 


xp(r,c)  =  a  +  n2(r,c) 


(3-24) 


xc(r,c)  =  g  +  n3(r,c). 


where  n2(r,c)  is  given  in  eq.  (3-18)  and  n3(r,c)  in  eq.  (3-21). 
matrix  for  (3-24)  is 


The  covariance 


K  = 


2  2 
-a  /Ar  -c  /AC 


-0^/Ar  2o^/(Ar)2 


o^/ArAc 


|_-o  /ac  a  /ArAc  2o  /(ac)‘ 
2 


a(Ar)‘ 


a(Ar)' 
-aAr 
-at 


-aAr 

2a 

1 


-Ar 

1 

2/a 


(3-25) 


where  a  =  ac/a r  is  the  aspect  ratio  for  the  grid.  The  inverse  matrix  for 
(3-25),  often  called  the  information  matrix,  is 


K 


-1  _  a(Ar)' 


*3/a(Ar)^ 

1/aAr 

1/Ar“ 

1/aAr 

1/a 

0 

• 

_1/Ar 

0 

a  . 

1 ikel ihood 

function 

for  our 

(3-26) 
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Mo  2\  n  \”3MN/2/ .  .  „.-MN/2 
L(a,6,y,a  )  =  (2*)  (det  K) 

exp  C-  y  .1  .(n1(ri,cj.),n2(ri,cj.),n3(ri,Cj))  K"1  (3-27) 

i » J 

.(n1(r.  ,Cj  ),n2(ri  ,Cj  )n3(ri  tc.  ))t], 

where  det  K  is  the  determinant  of  K  and  is  a®/ a2(Ar)4,  as  follows  using 
(3-25).  Now  maximization  of  (3-27)  is  certainly  possible,  but  the  calculation 
of  the  solution  to  the  likelihood  (normal)  equations  is  laborious.  It  would  be 
computationally  advantageous  perhaps  to  approximate  the  information  matrix 
by  a  diagonal  matrix.  It  turns  out  that,  for  large  Ar,  the  off-diagonal  terms 
in  (3-26)  may  be  deleted  without  appreciable  loss.  In  fact,  we  shall  show  that 
1 1 K ~^-K 3 ) | / 1 |K^_1| |  +  0  as  ir  +  +  •,  where  is  the  diagonal  matrix  diag 
2  2  2  2  2 

( 3/a  ,  (at)  /a  ,  (ac)  /a  ).  The  double  bars  denote  the  norm,  here  taken  to  be 
Euclidean,  of  the  matrix.  Using  (3-26),  we  have 


K1”A|  |  «  (2(a^+  l))l/^r/(9  +  (&r)4+  a4(Ar)4)1/2, 


(3-28) 


6  2  4 

from  which  the  result  follows  easily.  Also,  det  Kj  =  a  /3a  (Ar)  .  Let  us  form 
this  new  likelihood  function,  which  we  shall  call  Lj,  and  consider  its  negative 
logarithm.  One  wants  to  minimize 

-log  Lj  =  MN  (3  log  2tt  +  log  (o^/3a2(Ar)4))/2 

(3-29) 

+  .1  .[3ni2(ri,cj)  +  (Ar)2n22(r. ,Cj)  +  (Ac)2n32(ritCj)]/Zo2 

1  9  J 

o 

over  (a,  e,  y,  a  ).  In  order  to  do  this,  we  would  differentiate  (3-29)  with 

2 

respect  to  a,  8,  y  and  a  in  turn.  Setting  the  four  quantities  thus  obtained  to 

2 

zero,  we  would  solve  for  a,  6,  y  and  a  .  We  suppose  that  the  solution  exists 
and  is  unique. 

A  A  A  A  O 

The  proof  that  the  quantities  a,  8,  y,  a  thus  obtained  represent  the 
maximizing  point  of  (or,  equivalently,  the  minimizing  point  of  -log  Lj)  is 
not  too  difficult.  In  fact,  let  us  first  establish  the  following:  For  any  V>0, 
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2  2  2  2  2  2 

there  exists  an  R  >  0  such  that  -log  L1  >  V  whenever  a  +  B  +  y  +  (a  )  >  R  . 
Note  that 

-log  L,  =  MN  (3  log  2n  +  log  (o^/3a2(Ar)^)  )/2 


+  X  [3  lxij  -°ri  -»cj  +  (4r)2(xnj  -a)‘ 
■ 


(3-30) 


+  Uc)2(xcij  -  8)2]/2a2, 


where  xpij  and  xclj  are  the  measured  slopes  for  pixel  (r^,Cj)  in  the  r  and  c 
directions,  respectively,  M  is  the  number  of  columns,  and  N  is  the  number  of 
rows.  Also,  observe  that 


-o2log  Lj  =  a20(log(o2))  +  g(a,B,y) 


=  f  (a  )  +  g(a,8,Y)» 


(3-31) 


so  that  f(o2)  +  +  <»>  as  a2  >  +  «  and  f(c2)  0  as  a2  -*■  0.  We  see  that  (3-31)  is 

the  sum  of  two  quantities,  one  of  which  is  a  function  of  a 2  alone  and  the  other 

2 

of  which  is  a  function  of  a,  6,  and  y.  Furthermore,  although  f(a  )  becomes 
2 

negative  in  0  <  a  <  1,  it  is  bounded  there,  i.e.,  there  exists  a  positive 

2  2 

constant  k  such  that  |f(a  )|  <  k  whenever  a  is  in  [0,1],  Also,  it  is  well- 

1  o 

known10  that  there  is  an  R,  >0  such  that  g(a,B,y)  >  V  +  k  whenever 

2  2  2  1  2  2 
a  +  B  +  y  >  Rj-  1.  Therefore,  -a  log  L1  >  V  for  a  <  1  and 

a2  +  B2  +  y2  +  (a2)2  >  R^2.  Since  a2  <  1,  -  log  >  V.  Now  suppose  that 

cj2  >  1,  and  note  that  there  is  a  positive  number  P  such  that,  whenever 

2  2  2  2  2 

(<j  )  >  P  ,  f (a  )/a  >  V.  In  that  case,  -  log  L,  >  V  automatically.  Finally, 

2  1 

assume  that  1  <  a  <  P.  Note  then,  by  reasoning  similar  to  that  given 

2 

previously,  that  there  exists  an  R2  >  0  so  that  g(a,B,Y)/o  >  g(a,8,Y)/p  >  V 

2  2  2  2  2 

whenever  a  +  8  +y  >  "  p  •  Uncler  these  conditions  it  likewise  follows 

that  -  log  L,  >  V.  If  we  let  R  =  max(Ri,  Ro),  we  have  that  -  log  Li  >  V 

2  1  2  2  2  2  2  1  ^  A 

whenever  a  +  8  +  y  +  (o  )  >  R  .  On  the  other  hand,  by  continuity,  for 

any  e  >  0,  there  exists  a  6,  sufficiently  small  and  positive  such  that, 

7  7  7  7  7  ^7  7 

when  a  +  B  +  y  +  (a  )  <  ,  | -a  log  -  g(0,U,0)|  <  e.  That  is,  on  such 

a  closed  ball. 


(3-32) 
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g(0,0,0)  -  e  <  -a2  log  L ^  <  g(0,0,0)  +  e 

Assuming  that  g(0,0,0)  >  0,  one  ascertains  e  so  that  g(0,0,0)  -e  >  0,  also.  Now 
let  62  *  (g(0,0,0)  -  e )/V,  and  set  6  =  min  F rom  (3-32) 


1  /  2 


(*o  log  4)  >  (g(0,0,0)  -  e )/( (g(0,0,0)  -  e)/V)  =  V. 


(3-33) 


From  (3-33)  it  follows  that  -  log  L,  >  V  on  the  closed  ball 

p  p  p  p  p  p  p  ^ 

{ (<x,b,y ,a  ) | a  +  6  +  y +  {a  )  <  6  }.  If  we  now  choose  the  compact  set 
C  =  {(a,8,y,a2)|62  <  a2+  g2+  y2  -  (a2)2  <  R2},  where  xQpt=  (a,8,y,o2)eC,  it  is 
clear  that  xQpt  is  the  minimizing  point.  Remembering  that  £  r^  =  l  Cj  =  0,  the 
(unique)  solution  is  found  to  be 


«  =  (3  I  r.x.  +  (Ar)2  l  l  x  ..)/M(3  £  r  2  +  (Ar)2N) 

i  i  j  J  i 

3  =  O  I  c  x  +  (ac)2  l  l  x  .  .)/N(3  l  c  2  +  (ac)2M) 

j  J  J  i  j  c  J  j  J 

r  -  l  x./MN 
i 

i2  ={  J  l  [3<x(J  -  -  ,)2  ♦  (ir)2(xr1j-„)2 

*  J 

+  (Ac)2(xcij  -  6)2>/3MN, 


(3-34a) 


(3-34b) 


(3-34c) 


( 3— 34d ) 


where  x.  =  l  x.  .  and  x  .  =  l  x. . .  Here  a,  6,  and  y  are  seen  to  be  unbiased 
j  J  J  i  ij 

estimators  of  a,  8,  and  y,  respectively. 

A  A  A 

We  next  would  like  to  compute  the  covariance  matrix  for  a,  8,  and  y.  This 
is  a  straightforward  exercise,  using  the  fundamental  distributive  property 
cov  (X  +  Y,  Z)  =  cov  (X,Z)  +  cov  (Y,Z),  where  X,  Y,  and  Z  are  any  three  random 
variables  and  cov  means  covariance.  Again,  using  the  covariance  matrix  K  and 

the  fact  that  Y  r,  «  £  c.  =  0,  one  sees  that 
J  J  J 


var  (a)  =  a2<9  \  r.2+  2N(Ar)2)/M  (3  £  r.2+  (Ar)2N)2 

var  (8)  =  o2(9  Y  c-2+  2M(ac)2)/N  (3  £  c.2+  (ac)2M)2 
J  J  j  J 


(3-35a) 


(3-35b) 
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var  (y)  =  cr  /MN 


cov  (a, 6)  *  a2Ar  Ac/(3  £  r.2+  (Ar)2N)  (3  J  c,2+  (Ac)2  M) 

1  j  J 

cov  (a,Y )  =  -a2Ar/M  (3  £  r^2+  (Ar)2N) 

cov  (8,y)  =  -a2Ac/N  (3  J  c.2+  (Ac)2M). 

j  J 


(3-35c) 


(3-3bd) 


(3-3be) 


(3-3bf ) 


Note  that  all  of  the  variances  and  covariances  except  for  var  (y)  involve 
Ar  or  ac  or  both  Ar  and  ac.  Note  also  that  making  a r  and  Ac  larger  has  the 
effect  of  decreasing  the  values  of  these  parameters.  On  the  other  hand, 

reducing  the  number  of  points  decreases  M  and  N  and  the  sums  of  the  r^  and 

2 

Cj  .  This  sets  up  tradeoffs  which  we  shall  consider  in  more  detail  later. 
Let  us  note  that 


AAA  2  A  2  A  A  A  A 

var  (or-+  6c, +  y)  =  r,  var  o  +  c,  var  6  +  var  y  +  r.c.cov  (a,B) 

J  '  J  '  J 


(3-36) 


A  A  A  A 

+  r.cov  (a ,y )  +  c jCov  (b,y) 


for  every  pair  (i,j).  Therefore,  one  can  use  (3-35  a-f)  in  order  to  determine 
the  variance  of  the  estimated  facet  from  the  true  facet  at  any  point  (r^,Cj). 

Now  let  us  go  back  and  consider  the  real  likelihood  function  L,  as  given  by 
(3-27).  Forming  -  log  L,  as  we  did  before,  and  solving  the  normal  equations, 
one  finds  that 

[  £  (3r.+  Ar)x. .+  Ar J  (r.+  Ar)x  . .+  aAr  J  r.x  .  ArMNy ] 


M  [3  ^  N  (at)  J 

L  l  (3c. +  aAr)x. ,+  Ar  J  c,x  .  .+  aAr  Y  (c.+  Ar)x 


N  [3  l  c.  +  a  M(at)  J 

A  J 
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C3  I  ,xi  Ar.I '(rrij+  3  xcij}  “  ArMN  +  3^-* 

1  _ Lijl 


9 


(3-37c) 


where  a  =  Ac/Ar,  as  before.  As  opposed  to  (3-34c),  (3-37c)  contains  the 

A  A 

influence  of  the  r  and  c  partial  derivatives  of  x^j.  Note  that  a  and  b  depend 
solely  on  y.  Indeed,  (3-37a)  and  (3-37b)  may  be  substituted  into  (3-37c),  and 
one  obtains,  after  a  laborious  calculation. 


{3  l  xij.(3ri2+  (Ar)2)  (3Cj2+  (ac)2)  +  Ar }  (xrij-axrjj) 

i  ,j  1 »J 

(3r.2+  (at)2) (3cj2+  (Ac)2) 

y  =  -Ar(3Cj2+  (ac)2)  [J  (3r.+  Ar)  Ar J  (ri+  Ar)  xpij 

i ,  J  i » J 

tic  *  ' Vcij3 
■ 

-  Ar(37~?+  (Ar)2)  [a  £  (3Cj+ Ac)x1:j+ ac  ]>  CjXHj 

i ,j  1 »J 

+  aAc  l  (c,+  at)  x  ..]}  (3-38) 

■i  i  J  C1J 


(Ac)2(Ar)2+  6  (Ar)2Cj2  +  27  r.2  Cj2+  6  r^Ac)2 


3-19 


•V*  •  * 


V  -•  . 


where  r.  =  I  r.  /N  and  c.  =  Y  c.  /M.  We  shall  not  pursue  any  further  details 
i  i  j  4  1 

A  '  A  A  J 

concerning  a,  6,  and  y. 

Let  us  now  consider  the  case  where  central  differences  are  employed,  i.e., 
instead  of  (3-18),  one  uses 


,  x(r+Ar,c)  -  x(r-Ar,c)  .  nl(r+Ar,c)  -  ^(r-Ar.c) 

,(r,C) - 7T— - a  +  - 7T— - . 


(3-39) 


Then  one  sees  immediately  that,  provided  Ar  is  so  large  that  the  noise  sources 
are  uncorrelated,  x(r,c)  and  xp(r,c)  are  uncorrelated.  Furthermore,  we  can 
arrange  the  grid,  in  a  fashion  similiar  to  that  for  the  one-sided  differences, 
so  that  xp(r,c)  is  not  correlated  with  x(r1,c1),  xp(r1,c1),  and  *c(ri»c£)  either 
when  (r,c)  *  (r^Cj).  Clearly  xp(r,c)  has  mean  a  and  variance  o2/2(Ar)  . 
Likewise  one  has 


Xc(PjC)  =  6  +  -l— - - *  (3_40) 

from  which  one  sees  that  the  mean  of  xr(r,c)  is  $  and  that  the  variance  is 

2  2  ^ 
o  /2(ac)  .  Adjacent  points  in  a  row  might  be  represented  by  x(r,c)  and 

x(r,c  +  4ac)  in  this  model  and  adjacent  points  in  a  column  by  x(r,c)  and 

x(r  +  4at,c).  Note  that  this  imposes  a  certain  coarseness  on  the  mesh  that  was 

not  present  in  the  formal  difference  model.  However,  we  shall  see  that  the 

central  difference  model  has  the  advantage  of  simplicity.  The  basic  model 

equations  are  the  same  as  in  (3-24).  As  opposed  to  (3-2b),  however,  we  have  a 

much  simpler  covariance  matrix,  namely. 


n^r.c  +  Ac)  -  n^r.c  -  Ac) 


K  =  o2/2( Ar)2 


o2/2(ac)2 


2(Ar)‘ 


2(Ar)‘ 


.  (3-41) 


The  information  matrix  K"1  is 
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K 


l/2(Ar)2 

0 


0 


2 


a 


(3-42) 


The  negative  logarithm  of  the  likelihood  function  for  our  process  is 


-  log  L  =  MN  (31og  2n  +  log  (o^/4a2(Ar)^))/2 


+  I  [nj^ ( r .  ,Cj  )  +  2(Ar)2n/(ri,cj)  +  2(Ac)tn3t(ri .Cj)]/2ff\ 
1  $  J 


2  2, 


(3-43) 
2 


2 

to  be  minimized  over  (a,  B,  y,  a  ). 

Solving  the  normal  equations,  we  have 


=  1x1 


l  r.  x(r.,o)  +  2(Ar)2  i  .VVV 


M  (  £  r.2  +  2N(Ar)2) 
i  1 


(3-44a ) 


l  c  x(r  c  )  +  2(ac)2  l  x  (r  c  ) 

e  =  — - ? - ,  - 


N  d  c?  +  2M(ac)Y) 
j  J 


(3-44b) 


Y  =  l  x(r • ,c  •  )/MN. 
i  J  J 


(3-44c) 
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A  A  A 

One  notes  that  a,  8,  and  y  are  now  uncorrelated.  Furthermore,  the  variances  of 

A  A  A 

a,  8,  and  y  are,  respectively. 


var  (a)  =  o2(  J  r  2  +  2(at)2N)/M  (  J  r.2  +  2N(Ar)2)2 
i  i  1 

var  (8)  =  o2(  l  c.2  +  2(ac)2M)/N  (  £  c.2  +  2M(ac)2)2 
j  J  j  J 

var  (y)  =  o2/MN 


(3-4ba) 

(3-4bb) 

(3-4bc) 


The  rest  of  the  process  parallels  that  of  the  earlier  calculations,  so  that  we 
need  not  discuss  that  matter  further. 

Now  let  us  consider  the  circumstances  in  which  one  might  use  a  Sobolev 
version  of  the  sloped  facet  model.  For  illustrative  purposes,  we  examine  the 
forward  difference  formulation.  One  might  ask  the  following  question:  Since 
making  the  grid  coarser  could  improve  our  slope  estimates,  what  happens  when  we 
double  (or  perhaps  triple)  the  grid  size,  eliminating  half  (or  two  thirds)  of 
the  gray  level  information  from  consideration,  but,  at  the  same  time,  replacing 
that  knowledge  with  divided  difference  estimates  for  the  row  and  column  partial 
derivatives?  First  of  all,  note  that  a  least  squares  fit  to  intensity  data 
alone  corresponds  to  setting  Ar  and  Ac  both  to  zero  in  (3-3ba-b).  This  gives 


var  (a)  =  o2/M  £  r..2  (3-46a) 

var  (b)  =  o2/N  l  c.2.  (3-46b) 

i  J 

Next  let  us  see  what  happens  when  we  decide  to  eliminate  half  the  grid 
points.  For  purposes  of  illustration,  suppose  that  both  M  and  N  are  odd.  Then, 
for  example,  referring  to  Figure  3-4,  rows  might  be  indexed  from  top  to  bottom 
as  shown,  where  we  choose  to  eliminate  from  consideration  rows  r^,  r^,  and  r£. 
For  columns  the  analysis  is  similar,  proceeding  from  left  to  right.  In  general 
there  will  be  (N-l)/2  rows  and  (M-l)/2  columns  remaining,  and  the  new  grid 
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spacing  in  the  row  direction  will  be  2Ar,  where  Ar  is  the  original  spacing. 
Similarly,  there  is  a  new  column  spacing  2ac,  where  Ac  is  the  original  grid 
size.  It  is  here  that  one  might  use  the  original  grid  spacings  Ar  and  Ac  to 
procure  divided  differences,  in  line  with  our  previous  development.  According 

A  A  A 

to  (3-3ba-f),  we  obtain  a  covariance  matrix  for  the  estimates  o,  e,  and  y,  but 
based  on  (^-)  (^r-)  points.  The  relevant  quantities  can  be  calculated  from 
(3-3ba-f)  after  the  appropriate  substitutions  are  made.  We  note  that  our 
symmetry  conditions  l  r  *  ^  c  =  0  are  still  valid  after  the  modification  process 
just  described,  so  tftat  the  theory  is  still  applicable.  It  is  indeed 
conceivable  that  a  game  could  be  played  to  ascertain  just  how  many  points  could 
be  deleted  in  order  to  secure  a  meaningful  reduction  in  variance.  The  tradeoffs 
mentioned  before  with  regard  to  increase  in  grid  size,  corresponding  to 
improvement  in  slope  characteristics,  but  at  the  same  time  corresponding  to 
degradation  due  to  a  decrease  in  number  of  grid  points,  come  into  interesting 
interplay.  Note,  from  (3-2b),  that  doubling  of  grid  size,  leading  to  a  doubling 
of  mesh  size  for  divided  difference  considerations,  reduces  variances  by  a 
factor  of  four. 


FIGURE  3-4.  TESTING  THE  SOBOLEV  FACET  MODEL 
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If  one  wants  to  apply  Sobolev  modeliny  in  a  multispectra  1  framework,  some 
preprocessing  of  information  is  generally  required.  For  instance,  it  is  quite 
likely  that  red  and  blue  intensities,  for  the  same  pixel,  are  correlated.  Now 
suppose  that  we  want  to  use  a  linear  facet  model  description  for  gR(r,c)  and 
gB(r,c),  where 


gR(r,c)  =  aRr  +  eRr  +  yR  +  n1R(r,c) 
gB(r,c)  =  agr  +  6gr  +  Yg  +  nlg(r,c). 


(3-47) 


Having  an  estimate  for  the  covariance  matrix  C  of  the  noise  vector  (niR.nig)^, 
we  know  that  there  exists  a  2  by  2  orthogonal  matrix  P  which  converts  C  to  a 
diagonal  form,  i.e.. 


PCP1  =  D. 


(3-48) 


Therefore,  applying  P  to  (3-47),  we  have 


/9R'(r,c)\  _  ,9R(r,c)\ 

\gB  ( r , c ) /  p  \gB(r.c)/ 


with 


( 3-bO ) 


•3_0/l 


NSWC  TR  84 -b4 


so  that,  in  the  transformed  version,  the  covariance  matrix  (and  hence  the 
information  matrix)  is  diagonal.  Now  it  is  feasible  to  apply  the  Sobolev  norm 

A  A  A  A  A  A 

to  determine  estimates  oR',  BR‘,  yR‘,  ctB',  6g',  and  yb'.  There  will  be  a  b  by  6 

covariance  matrix  for  these  quantities  expressible  as  a  direct  sum  of  two  3  by  3 
covariance  matrices,  since  red  and  blue  information  are  decoupled  from  each 
other.  Once  these  estimates  have  been  determined  in  primed  coordinates, 
premultiplication  of  (3-49)  by  P*  gets  one  back  to  the  original  gray  level 
context,  and  the  variance-covariance  properties  in  the  unprimed  frame  are  easily 
obtained  through  linear  combinations  evolving  from 


For  example,  if 


(3-t>2) 


XR  =  P11“R  +  p12aB 


(3-b3) 


so  that 


A  ^  A  A  A  ^  A 

var(aR)  =  pu  var  (oR * )  +  2pnp12cov(oR '  ,ag ' )  +  p12  var(og').  (3-b4) 


It  must  be  realized  that  nothing  dictates  that  we  must  use  divided 
difference  approximations  for  derivatives.  In  fact,  as  was  done  in  one  of  the 
author's  reportslb,  we  may  use  any  estimates  that  we  have  available  in  the 
context  of  the  physical  situation. 
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We  conclude  this  chapter  by  remarking  that  what  we  now  have  is  a  first 
order  model  for  target  description  in  the  sense  that  the  mesh  is  assumed  so 
coarse  that  we  can  hypothesize  independence  of  noise  from  one  pixel  element  to 
the  next  and  in  the  sense  that  no  distinction  has  been  made  as  to  the  kind  of 
noise  present,  i.e.,  instrument  noise  and  process  noise  (cloud  effects,  spurious 
reflections  off  targets,  etc.)  have  not  been  distinguished.  The  separation  of 
such  noise  sources  leads  to  the  conception  of  Kalman  filtering  (and  other  kinds 
of  filtering  as  well).  The  next  chapter  will  be  devoted  to  deriving  two- 
dimensional  filters  in  the  context  of  multispectral  analysis  and  to  interfacing 
the  sloped  facet  logic  and  these  filter  models. 
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CHAPTER  4 

TWO-DIMENSIONAL  FILTERING  APPLIED  TO  COLOR  IMAGING 
THE  KALMAN  FILTER 

Before  embarking  on  our  discussion  of  the  two-dimensional  Kalman  Filter, 
let  us  again  adopt  a  philosophical  point  of  view.  Up  to  this  point,  we  have 
used  the  concept  of  the  quad-tree,  together  with  sloped  facet  modeling  and  F 
tests  of  significance,  in  order  to  ascertain  components  of  interest.  We  have 
observed  that  the  boundaries  of  the  components  flow  naturally  out  of  this 
procedure,  there  being  no  need  even  to  make  use  of  edge  detection  operators 
based  on  gradient  or  gradient-like  methods.  However,  we  also  noted  that 
inherent  in  the  methodology  was  the  assumption  of  independence  of  noise  between 
pixels.  This  assumption  seems  to  imply  a  coarse  mesh,  so  that  the  correlation 
between  points  can  be  neglected.  We  shall  see  that  a  Kalman  filter  is  based  on 

1  (j 

an  autoregressive  process  ,  so  that  some  kind  of  Markov  hypothesis  is  at 
work.  Therefore,  one  is  ^ble  to  refine  the  mesh  and  take  account,  in  a 
systematic  way,  of  the  noise  dependencies  thus  engendered.  Furthermore,  we 
shall  be  able  to  separate  out  the  effects  of  noise  entering  the  process  itself 
from  the  instrument,  or  observation,  noise.  Our  idea  will  be  to  use  the  sloped 
facet  model  results  in  two  ways:  (1)  In  order  to  derive  an  a  priori  covariance 
matrix  for  the  state  vector  so  as  to  implement  locally  the  two-dimensional 
Kalman  filter;  (2)  in  order  to  generate  initial  geometrical  information  about 
the  boundaries  of  components  of  interest.  Of  course,  the  sloped  facet  model 
itself  has  another  virtue:  Inasmuch  as  it  provides  a  first-order  description  of 
an  image,  that  description  might  be  good  enough.  On  the  other  hand,  if  one  is 
interested  in  higher  levels  of  detail,  where  one  needs  to  employ  a  fine  mesh, 
the  Kalman  filter  should  provide  an  improvement. 

g 

The  Kalman  filter  which  we  now  introduce  is  based  on  the  work  of  Woods 
and  Kaufman,  Woods,  Dravida,  and  Tekalp  .  It  will,  however,  be  generalized  to 
incorporate  the  aspects  of  the  color  imaging  procedure.  First  of  all,  one 
should  understand  the  meaning  of  the  state  vector  as  given  by  Woods.  He  defines 
it  as  the  minimum  amount  of  information  about  past  and  present  estimates  needed 


4-1 
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to  determine  an  optimal  causal  estimate  of  future  response  given  future  noisy 
observations.  Generally  speaking,  the  dynamic  model  for  this  case  is  given  by 


sjm)  =  Fs_(m-1)  +  Gu_(m) 
r_(m)  =  Hs_(m)  +  v_(m). 


(4-1) 


where  s_  is  the  state  vector,  _r  is  the  observation  vector,  ^is  the  observation 
noise  vector,  and  u_is  the  process  noise  vector. 

Woods  points  out  that  the  concept  of  state  vector  as  given  in  the  preceding 
paragraph  will  unfortunately  generally  lead  to  a  large  amount  of  computation 
time  in  situations  where  the  number  of  pixels  in  a  row,  N,  is  quite  a  bit  larger 
than  the  order  of  the  support  M  of  the  filter.  We  shall  now  define  more 
precisely  the  state  vector  and  the  support  M.  Underlying  the  model  is  the 
concept  of  a  scalar  autoregressive  process  given  by 


s(m,n)  =  l  c(k,£)  s(m-k,n-*)  +  w(m,n),  (4-2) 

(m-k ,n-£ )eRM(m,n) 


where 


RM(m,n)  =  L(m-k,n-t)  |  (1  <  k  <  M,  0  <  i  <  M) 

U(-  M<;k<U,  M)]. 


(4-3) 


The  region  RM(m,n)  is  indicated  by  the  dots  in  Figure  4-1.  Note  that 
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Woods  introduces  the  following  pseudo-state  vector  corresponding  to  a 
scalar  line  by  line  scan: 


!}(m,n)  =  [s(m,n),...,s(m-M,n);  s(m+M,n-l . s(m-M,n-l); 

s(m+M,n-M),...,s(m-M,n-M)]t. 


(4-5) 


In  his  updating  procedure,  he  chooses  only  to  calculate  ^(m,n)  at  each  step 
rather  than  s^m.n). 

Up  to  this  point,  we  have  merely  reviewed  the  work  of  Woods  and  his 
colleagues.  In  order  to  adapt  their  procedures  to  image  modeling,  several 
points  should  be  made:  (1)  -In  view  of  what  has  transpired  in  earlier  chapters, 
we  see  that  we  may  not  need  to  run  the  Kalman  filter  over  the  entire  image.  It 
may  very  well  be  sufficient  to  localize  it  to  a  subrectangle  of  interest 
containing  a  critical  component.  (2)  If  the  subrectangle  containing  our 
component  is  "small",  it  may  be  unnecessary  to  resort  to  the  Reduced  Update 
Kalman  Filter  of  Woods.  In  other  words,  use  of  the  state  vector  ^(m,n)  instead 
of  sj(m,n)  in  the  update  procedure  may  not  entail  that  much  more  work.  (3)  The 
sloped  facet  model  should  allow  us  to  build  a  reasonable  a  priori  covariance 
matrix  for  our  state  vector. 

In  view  of  the  above  remarks,  we  shall  not  emphasize  the  Reduced  Update 
Kalman  Filter  in  what  follows,  but  we  shall  review  how  one  estimates  the 
coefficients  c(k,ji)  and  the  process  noise  variance  Qw,  the  latter  of  which  is 
now  generally  assumed  to  vary  from  point  to  point.  Also,  it  should  be  mentioned 
that  the  state  vector  in  our  case  generally  consists,  as  before,  of  red  and  blue 
normalized  color  components,  together  possibly  with  estimates  of  the  first-order 
partial  derivatives.  These  are,  of  course,  recorded  throughout  the  support 
region,  as  reflected  in  (4-4). 

If  we  are  using  a  primary  color  representation,  possibly  together  with  the 
first  partial  derivatives,  we  must  employ  a  vector  analogue  of  (4-2).  For 
example,  one  might  have 
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9R  (m.n)  =  I 


(m-k,n-t )eRM(m,n) 


CR  (k .A )9r (m-k.n-A ) 


wR(m,n) 


(4-6) 


gR(m.n)  =  I  cR(k,*)gR(m-k,n-t)  +  wR(m,n), 

B  (m-k,n-t )eR  (m,n)  8  8  8 

M 

where  one  must  estimate  the  coefficients  cR(k,t),  Cg(k,£),  together  with  the 
noise  covariance  matrix  Qw  for  the  vector  (wR,  Wg)*.  We  would  assume  that  the 
vector  (wR,  Wg)^  is  uncorrelated  between  pixels,  but  that  wR  and  wB  might  be 
correlated  at  the  same  pixel.  If  one  has  first-order  partials  for  gR  and  gg  to 
be  considered,  then  there  would  be  four  more  relations  analogous  to  those  in 
(4—6)  to  be  invoked.  One  would  then  have  a  six  by  six  covariance  matrix  CL.  As 
in  Woods,  one  can  express  the  vector  s_(m,n)  as  (s^Cm.n,  s^t(m,n))t,  with  Sj 
given  by 


(m,n)  =  [gR  (m,n),  gB  (m,n),  gR  (m-I,n),  gB(m-l,n) . 

gR(m-M,n),  gB(m-M,n);  gR(m+M,n-l),  gB(m+M,n-l), 

...,  gR  (m-M,n-l),  gB(m-M,n-l);  ....  (4-7) 

gR(m+M,n-M),  gB(m+M,n-M) . gR  (m-M,n-M), 

gB(m-M,n-M)]t. 


The  vector  js^m.n)  is  then  just  the  remaining  part  of  the  state  vector  sjm,n), 
including  both  red  and  blue  components.  The  state  dynamical  model  now  becomes 

s_(m,n)  =  C  s_(m-l,n)  +  w(m,n),  (4-8) 

where  C  is  the  system  propagation  matrix  determined  by  (cR(k,t),  Cg(k,t)}  and  by 
the  ordering  of  the  state  vector  s(m,n).  The  process  noise  vector  is  given  by 
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w(m,n)  =  (wR(m,n),  Wg(m,n),  O,...^)1.  (4-9) 

There  is  an  observation  vector  equation 

£.(m,n)  =  Hsjm.n)  +  v_(m,n),  (4-10) 

where  £(m,n)  =  (rR(m,n),  r^m.n)1;  H  =  (J  J  ””’q)  , 

and  _v(m,n)  =  (vft(m,n),  vB(m,n))t.  Now  suppose  that  one  has  determined,  using 
the  quad-tree  analysis  and  sloped  facet  modeling,  a  preliminary  description  of  a 
certain  component  and  its  neighborhood  in  terms  of  the  facet  description,  as 
indicated  in  Figure  4-2.  Starting  at  the  upper  left-hand  corner  of  the 
rectangle,  one  determines,  first  of  all,  the  facet  corresponding  to  the  first 
pixel.  By  a  facet  corresponding  to  a  pixel,  we  shall  mean  the  following:  We 
identify  the 


FIRST  PIXEL  (1,1) 


CRITICAL  COMPONENT 


FIGURE  4-2.  LOCALIZING  A  CRITICAL  COMPONENT  FOR  KALMAN  FILTERING  STUDY 
facet  by  looking  at  the  center  of  the  pixel  and  the  particular  facet  to  which 
that  center  pertains.  Then  the  covariance  matrix  for  the  first  pixel  is 
ascertained  simply  by  computing  the  sample  covariance  matrix  over  all  pixels  in 
the  domain  of  that  facet  (or  possibly  in  some  prescribed  window  about  that 
pixel ).  For  example. 
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var  (gR(l,l))  =  l  (gR (i  ,j)  -  aRr.-  Lc,-  )2/N(F1 ),  (4-11) 

(i,j)eD(F1)  R  Ri  Rj  R  1 

where  D{Fj)  is  the  domain  of  facet  F^  corresponding  to  pixel  (1,1)  and  N(F^)  is 
the  number  of  pixels  in  the  domain  of  that  facet.  Also, 


cov  (gR(l,l),  gB ( 1 , 1 ) )  =  l  ( 9R ( i » J )  -  aRr.  -  lRc.  -  yR)  (4-12) 

(i ,  j  )eD(F  j )  *  *  a 

(gB(i.j)  -  aBri  -  BgCj  -  yb)/N(F1). 

Finally,  var(gB(l,l))  is  just  (4.11)  with  B  replacing  R  as  a  subscript.  We 
proceed  to  march  across  the  first  row,  computing  the  same  type  of  information, 
remembering  that,  under  first  order  modeling,  gray  levels  are  assumed 
independent  from  pixel  to  pixel.  In  any  event,  at  this  level,  let  us  ignore 
such  interpixel  correlations.  We  shall  finally  procure  a  covariance  matrix 
P(m,n)  for  some  pixel  (m,n)  which  we  shall  accept  as  an  initial  covariance 
matrix  to  start  the  vector  processor.  It  will  be  expressed  as  a  direct  sum  of 
two  by  two  matrices. 

Having  thus  described  how  sloped  facet  modeling  might  be  useful  in  starting 
the  Kalman  filtering  model,  let  us  proceed  to  show  how  one  estimates  the 
coefficients  cR(k,n),  Cg(k,*),  and  the  noise  covariance  matrix  Qw.  First  of 
all,  let  us  assume,  as  it  is  customary  to  do,  that  one  knows  the  characteristics 
of  his  measuring  apparatus,  so  that  he  can  establish  the  covariance  matrix  for 
the  measurement  noise,  namely. 


(4-13) 


2 

where  oyR  is  the  measurement  variance  for  the  red  component,  ayRg  is  the 

measurement  covariance  for  red  and  blue  (which  may  be  zero,  for  example),  and 
2 

a  R  is  the  measurement  variance  for  the  blue  component.  Following  the 

VD  q 

development  in  Kaufman's  paper  ,  it  turns  out  that  the  generalization  to  our 
situation  is  straightforward.  Let  us  introduce  the  matrix  J  given  by 


4-7 
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with  the  following  elucidation  of  notation:  rR(m,n)  and  rB(m,n)  are  simply  the 
measured  intensities  in  the  red  component  and  blue  component  at  pixel  (m,n), 
respectively.  The  column  vectors  of  red  and  blue  coupling  coefficients  are  c^ 
and  Cg,  as  given  in  equations  (4-6).  Note  that  they  are  defined  over  the  region 
Rf/|(m,n),  as  illustrated  in  Figure  4-1.  The  vectors  rR(*)(m-l,n)  and  rB^^(m- 
l,n)  are  pseudovectors  given  by 

Xg^^m-l.n)  =  [rR(m-l,n),...,rR(m-M-l,n);rR(m+M-l,n-l), 

. . . ,rR (m-M-1 ,n-l ) ; . . . ; rR (m+M-1 , n-M) , 

...,rR(m-M-l,n-M)]t 

(4-lb) 

£g(*)(m-l,n)  =  [rB(ni-l,n),...,rg(m-M-l,n);rB(m+M-l,n-l) 

.. .,rB (m-M-1, n-1);... ;rB (m+M-1, n-M), 

.. . ,rg(m-M-l,n-M)]t 

Now  we  also  know  that 

rR(m,n)  =  gR(m,n)  +  vR(m,n) 

(4-16) 

rB(m,n)  =  gB(m,n)  +  vB(m,n) 
and  that 
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£g(*)(m-l,n)  =  jg^^^(n\-l,n)  +  vg(^(m-l,n) 
rg^On-l.n)  =  j^1 )  (m-l,n)  +  vg^On-l.n), 


(4-17) 


where  j2g^)(m-l,n)  and  _cjg^ ) (m-l,h)  are  obtained  from  (4-15)  upon  replacing  r  by 
g.  Likewise  the  noise  vectors  Vj^^m-l.n)  and  vg^)(m-l,n)  are  found  by 
substituting  v  for  r  in  (4-15).  In  (4-16),  vR(m,n)  and  vB(m,n)  represent  the 
measurement  noise  at  (m,n)  in  the  red  and  blue  components,  respectively,  and 
gR(m,n)  and  gB(m,n)  are  given  by  (4-6).  Substitution  of  relations  (4-16)  and 
(4-17)  into  (4-14)  leads  to  the  fact  that 


J  =  Qw  +  ^v  + 


/%*  U\  g  /%  A 

v%7  v\°%/ 


(4-18) 


where  U  has  the  block  form 

V  <\ 


vR  1 

ctvRB 

vRB1 

2 

avB 

and  I  is  an  identity  matrix  of  appropriate  size.  This  matrix  is,  of  course, 
diagonal  when  ovRB  =  0.  Once  we  have  estimates  for  eg,  Cg,  and  J,  we  can  solve 
(4-18)  to  obtain  CL.  Following  the  development  in  Kaufman's  paper,  estimation 
of  J,  given  Cg  and  Cg,  is  straightforward.  One  simply  uses  a  window  around 
pixel  (m,n)  and  computes  sample  expected  values  as  approximations  to  true 
expected  values  in  (4-14). 


Let  us  next  discuss  the  estimation  of  Cg  and  Cg.  In  order  to  do  imbiased 

20 

image  parameter  identification,  we  shall  employ  observation  correlations 
Substituting  (4-16)  and  (4-17)  into  (4-6),  we  find,  in  matrix  notation,  that 
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/rR(m,n)\  /cRt  0  \/Vg^  (m-l,n\  /cRt  oX/^^^m-l.nA 
yrB(m,nj/  \0  Cg/y^ U) (m-l,n j )  \0  cBt/^B(1)(m-l,n)/ 

(4-19) 

VwR(m»n)\  /vR(m’n)\ 

l  Wg(m,ny  lvB(m,n  )/* 

Premultiplying  (4-19)  by  the  vector  (rR(m-k,n-£ ),  rg(m-k,n-£)),  i.e.,  taking  an 
inner  product,  we  find 


rR  (m-k,n-£ )  rR  (m,n)  +  rB  (m-k.n-t)  Tg(m,n) 

=  rR(m-k,n-£)  Cg1"  rg^(m-l,n)  +  rB(m-k,n-£)  Cgt  £g^  (m-l,n) 

(4-20) 

-  rR(m-k,n-£)  Cg*  Vg^(m-l,n)  -  rg(m-k,n-£)  Cg*  £g^(m-l,n) 

+  rR(m-k,n-£)  wR(m,n)  +  rB(m-k,n-£)  Wg(m,n) 

+  rR(m-k,n-£)  vR(m,n)  +  rg(m-k,n-£)  Vg(m,n) 

Taking  expectations  of  both  sides  of  (4-20)  and  choosing  k  and  i  sufficiently 
large  and  positive,  one  finds  that 


E  [rR(m-k,n-£)  rR(m,n)]  +  E  |>B(m-k,n-£)  rB(m,n)‘J 
=  Cg1*  E[rg^^(m-l,n)  rR(m-k,n-£)3 
+  £gt  E  Crg ^ 1 ^ (m-1 ,n )  rB(m-k,n-£)J. 


(4-21) 


Then  one  can  obtain  estimates  for  Cg  and  Cg,  namely,  Cg  and  Cg,  by  posing  a 
least  squares  problem  in  which  (4-21)  is  used.  This  completes  our  discussion  of 
the  Kalman  filtering  methodology  and  of  its  interconnections  with  the  sloped 
facet  model. 
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NONLINEAR  FILTERING 

Linear  autoregressive  models,  together  with  linear  measurement 
relationships,  imply  utilization  of  a  linear  Kalman  filter  for  smoothing 
purposes.  On  the  other  hand,  nonlinear  autoregressive  models  will  implicate 
nonlinear  filtering  techniques  (or  perhaps  extended  Kalman  filter  algorithms). 
For  example,  one  might  have 


gR(m,n)  =  PRL{gR (m-k ,n-t )} ]  +  wR(m,n) 
gB(m,n)  =  Pg[{ % (m-k ,n-£ )} J  +  Wg(m,n), 


(4-22) 


where  PR  and  Pg  are  red  and  blue  polynomials,  respectively.  There  may  also  be 
some  more  general  functional  relationship  among  the  intensities.  It  will  be 
understood,  as  before,  that  (m-k,n-t)  e  RM(m,n). 


We  shall  now  introduce  a  completely  nonlinear  filtering  algorithm.  It  is 
based  on  Bayes'  theorem  and  goes  as  follows:  Let  us  suppose  that,  at  pixel 
(m,n),  one  has  the  state  vector  ^jm,n)  given  in  the  manner  heretofore  prescribed 
with  alternating  red  and  blue  components.  Let  pb(s_(m,n))  be  the  probability 
density  for  s_(m,n)  before  updating  (i.e.,  before  invoking  the  measurement  at 
(m,n)).  Assume  a  measurement  model  of  the  form 


rR(m,n)  =  hR(gR(m,n))  +  vR(m,n) 

(4-23) 

rB(m,n)  =  hB(gg(m,n))  +  vg(m,n), 

where  hR  and  hg  may  be  nonlinear  functions.  Let  _r(m,n)  =  (rR(m,n),  rB(m,n))t. 
We  shall  assume,  as  before,  that  (vR.Vg)*  is  a  zero-mean  Gaussian  vector  with 
covariance  matrix  Qv.  Suppose,  for  simplicity,  that  pb  is  also  a  Gaussian 
density.  Then  the  probability  density  pa  (s_(m,n)  (_r_(m, n ) ),  namely,  the 
probability  of  the  state  vector,  given  the  most  recent  measurement,  is  just 

pa(sjm,n)jr_(m,n))  =  p(_r(m,n) ) sjm,n) )  pb(s_(m,n) )/p(r_(m,n) ).  (4-24) 


4-11 


From  ( 4—23 ) , 


p(r(m,n)|s(m,n))  = 
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|t(V  v8>  Qv‘1(vg): 


(4-2S) 


Also,  if  pb  is  Gaussian,  we  have 


p.  (s(m,n))  =  — - 

D  /  o 


-  j  (s(m,n)  -  |(m,n))tK"1(s(m,n)  -  |(m,n)) 


(2w,(NHW1)/Z|k 


,(4-26) 


where  K  is  the  covariance  matrix  and  s(m,n)  is  the  expected  value  of  s_(m,n) 
before  updating  at  (m,n).  Finally,  p(r_(m,n))  is  just  the  marginal  density 
obtained  by  integrating  the  numerator  of  (4-24)  with  respect  to  the  state  vector 
s_(m,n).  Note  that  vR  and  vB  in  (4-2b)  are  to  be  replaced  by  the  differences 
rR(m,n)  -  hR(gR(m,n))  and  rg(m,n)  -  hg(gB(m,n)),  respectively. 

Now  that  we  have  pa(g(m,n)|r(m,n)),  how  do  we  advance  the  process?  The 
clue  is  to  use  (4-22)  with  m  replaced  by  m+1  if  we  are  not  at  the  end  of  a  row 
and  with  n  replaced  by  n+1  and  m  replaced  by  1  otherwise.  We  shall  illustrate 
the  process  when  we  are  not  at  the  end  of  a  row.  In  that  case,  one  might  set  up 
the  system  of  relations 
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gR(m+l,n)  =  PR[{gR(m+l-k,n-i)} ]  +  wR(m+l,n) 
gB(m+l,n)  =  Pg[{gB(m+l-k,n-i)} ]  +  wB(m+l,n) 


gR(m,n)  =  gR(m,n) 
gB(m*n)  =  gB(m,n) 

9RU,n)  =  gR(l,n) 

9B ( 1 » n )  =  gB(l,n)  (4-27) 

gR (N , n-1 )  =  gR(N,n-l) 
gB(N,n-l)  =  ge(N,n-l) 

gR(l,n-l)  =  gR(l,n-l) 
gB(l,n-l)  =  gB(l,n-l) 

9R (N,n-M)  =  gR(N,n-M) 
gB(N,n-M)  =  gB(N,n-M) 

gR(m+l-M,n-M)  =  gR(m+l-M,n-M) 
gB(m+l-M,n-M)  -  gB(m+l-M,n-M), 


representing  a  transformation  from  the  set  (wR(m+l,n),  wB(m+l,n),  sR(m,n)t,  Sgfm.n)1} 
to  the  set  (gR(m+l,n),  gB(m+l,n),  s^m.n)1,  Sgfm.n)1}.  where 


lR(m»n)  ( gR (i4 * n )»•••» gR ( 1  , n ) ,  9R(N»n-l),...,gR(l,n— 1),..., 


gR(N,n-M gR(m+l-M,n-M)) 


(4-28 ) 


and  Sgfm.n)1-  is  obtained  upon  replacing  R  by  B  in  (4-28).  The  Jacobian  of  the 
system  (4-27)  is  just  unity.  Therefore,  the  probability  density  function  for 
the  state  vector  (sR(m,n)t,  sB(m,n)t,  wR(m+l,n),  wB(m+l,n))t,  with  wR(m+l,n)  and 
wB(m+l,n)  replaced  by  gR(m+l,n)  -  PR[{gR(m+l-k,n-t )} ]  and  gB(m+l,n)  - 
PB*-{9B^m+1’k’n_il)}-1,  respectively,  will  furnish  the  density  at  the  next  point 
before  the  measurement  update.  If  (4-24)  is  then  applied  with  m+1  in  place  of 
m,  the  process  will  have  been  successfully  advanced.  Of  course,  as  the  process 
moves  along,  the  expected  state  and  covariance  matrix  structure  can  be  computed 


4-13 
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using  multiple  quadrature  methods.  For  example. 


E(s(i  ,j ) )  =  /  /  .../  s(i  ,j)pa(s(m,n)|r(m,n))ds(m,n). 


(4-29) 


j 

! 


where  s(i,j)  is  a  component  of  the  vector  s_(m,n).  Also, 

oo 

E(s(i  ,j))s(i1,j1)  =  //  ...  /  s(i,j)s(i1,j1)pa(s(m,n)|r(m,n))ds(m,n),(4-30) 


where  s ( i , j )  and  s(i i» Ji )  are  both  elements  of  s_(m,n).  A  simplified  version  of 
this  general  process  is  obtainable  by  approximating  the  density  of  (4-24)  by  a 
Gaussian  density  after  computing  relations  (4-29)  and  (4-30).  A  variety  of 
Monte  Carlo  integration  methods  may  be  invoked  to  perform  the  integrations  in 
(4-28)  and  (4-29).  Of  course,  the  parameters  of  the  model  (4-22),  i.e., 
coefficients  like  the  Cj^'s  in  the  linear  model  and  the  process  noise  covariance 
matrix,  would  have  to  be  obtained  in  some  fashion,  perhaps  analogously  to  the 
manner  in  which  they  were  secured  in  the  linear  version.  This  completes  our 
discussion  of  filtering  methodology. 


NSWC  TR  84-54 


CHAPTER  5 

IMAGE  BOUNDARY  ESTIMATION 

Having  introduced  methodology  for  smoothing  the  image,  we  still  have  ahead 
of  us  the  tasks  of  properly  segmenting  the  target  and  of  identifying  the  target 
based  on  our  understanding  of  the  types  of  components  thus  obtained  and  of  their 
positions  relative  to  one  another.  Of  course,  in  our  discussion  in  Chapter  3, 
we  observed  that  a  first  order  segmentation  of  the  target  would  fall  out  of  a 
region  analysis  as  a  natural  by-product.  In  the  present  chapter,  we  shall  show 
how  geometrical  information  gained  from  this  first  order  approach  can  be  input 
to  a  more  sophisticated  procedure  for  estimating  the  boundary  which  makes  use  of 
the  output  of  the  2D  Kalman  filter. 

We  shall  invoke  the  methodology  due  to  Nahi  and  Jahanshahi Their  theory 

21 

is  build  around  the  idea  of  using  a  replacement  process  in  order  to 
distinguish  an  object  from  its  background.  In  their  work,  a  brightness  function 
b(m,n)  was  defined  and  conceived  of  as  a  combination  of  an  object  brightness 
function  bQ(m,n)  and  a  background  brightness  function  bb(m,n)  as  follows: 

b(m,n)  =  Y(m,n)boR(m,n)  +  [l-Y(m,n)]bbR(m,n),  (5-1) 

where  y(m,n)  =  1  or  0  depending  upon  whether  or  not  the  pixel  (m,n)  is 
considered  to  be  a  point  of  the  object  or  a  point  of  the  background.  Therefore, 
the  real  game  was  that  of  determining  the  statistical  properties  of  y  and  using 
them  to  advantage  for  object  recognition  purposes. 

Since  our  model  is  to  be  based  on  the  idea  of  using  color  imaging,  let  us 
introduce  the  following  modification  of  (5-1): 

bR(m,n)  =  Y(m,n)boR(m,n)  +  [1-y (m,n)]bbR(m,n) 

(5-2) 

bB(m,n)  =  y(m,n)boB(m,n)  +  [1-y (m,n)]bbB(m,n), 

where  the  task,  as  before,  will  be  that  of  using  the  statistical  properties 
of  y  in  order  to  determine  whether  or  not  (m,n)  is  a  point  of  the  object.  Again 


5-1 
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y(m,n)  =  1  when  (m,n)  lies  in  the  object,  and  y(m,n)  =  0  when  (m,n)  is  outside 
the  object.  In  addition,  background  likewise  generally  has  color 
characteristics.  Therefore,  we  should  observe  some  fundamental  level  of  red  and 
blue  in  the  background  as  well. 


As  was  done  in  the  Kalman  filtering  analysis,  it  is  convenient  for 
processing  purposes  to  think  in  terms  of  row  by  row  (or  perhaps  column  by 
column)  scanning.  In  the  Nahi-Jahanshahi  paper,  it  was  assumed  that  the  object 
whose  boundary  one  would  like  to  ascertain  is  horizontally  (or  possibly 

p 

vertically)  convex.  An  object  ECR^  is  said  to  be  horizontally  convex  if,  for 

(1)  ,  1  lv  c  (2)  ,  2  2.  _  _  1  2.1  2  (1)  ,  (2) 

x  =  (xj  ,x2  )eE,x  =  (xj  ,x2  )eE,  with  Xj  *  Xj  and  x2  =  x2  ,ox  '+(l-a)x 
eE,  U  <  a  <  1.  An  example  of  a  horizontally  convex  object  is  given  in  Figure 
5-1.  However,  this  hypothesis  may  very  well  be  over-restrictive.  For  example,  a 


FIGURE  5-1.  A  HORIZONTALLY  CONVEX  OBJECT 
character  like  the  letter  N  shown  in  Figure  5-2  is  neither  horizontally  nor 
vertically  convex.  We  would  like  to  be  able  to  treat  such  objects  in  the 
present  context.  Also,  objects  with  holes  should  be  treatable.  Therefore,  we 
shall  provide  a  generalization  of  the  Nahi-Jahanshahi  theory  in  this  regard. 
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SG 


FIGURE  5-2.  A  NON-HORIZONTALLY  CONVEX  SET 

We  shall  assume  that  we  have  used  sloped  facet  modeling  in  order  to  isolate 
a  component  of  interest,  together  with  a  first-order  approximation  to  its 
boundary.  Suppose  next  that  we  have  localized  the  component  to  a  rectangle 
(perhaps  one  in  the  quad-tree  analysis  described  in  Chapter  3).  Our  task  will 
be  to  describe  the  use  of  a  vector  filter  processor  in  conjunction  with  a 
boundary  estimation  scheme  similar  to  that  of  Nahi  and  Jahanshahi  in  order 
properly  to  estimate  y(m,n).  The  replacement  concept  is  again  to  be  invoked, 
whereby  we  run  the  Kalman  or  nonlinear  filter  processor  for  both  the  background 
and  the  object  individually.  The  scanner  output  is,  in  scalar  form. 


sR(k)  =  x(k)  sQR(k)  +  [1-x (k )3  sbR(k) 


sB(k)  «  X (k)  sqB ( k )  +  [1-X (k )]  s bB ( k ) , 


(5-3) 


where  soR(k)  is  the  object  red  component  gray  level  at  pixel  k  and  sbR(k)  is  the 
background  red  component  at  point  k.  Similar  remarks  can  be  made  about  the  blue 
component.  We  shall  use  a  number  of  lines  of  output  from  the  vector  processor 
at  most  equal  to  the  order  M  of  the  filter;  and,  as  we  execute  the  filter,  we 


■/■.V.'  ■.'  v.-.'  -.' % 1  • 
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shall  attempt  dynamically  to  produce  a  Nahi -Jahanshahi  type  boundary  estimation 
procedure. 

Now  let  us  proceed  with  the  details  of  the  estimation.  The  statistics  of 

the  process  x(k)  are  to  be  obtained.  We  assume  that  we  have  before  us  a  first 

order  approximation  to  the  boundary  as  given  through  a  sloped  facet  approach. 

The  geometrical  information  thus  provided  is  to  be  input,  together  with  the 

Kalman  or  nonlinear  filter  results,  in  determining  a  better  boundary  estimate. 

In  the  Nahi -Jahanshahi  formulation,  the  problem  was  to  determine,  for  each  scan 

line  r,  a  vector  w  =  (a  ,8  ),  where  «  is  the  first  boundary  element 
r  r  r  r 

encountered  by  the  scanner  and  8p  the  second  boundary  element  encountered  by  the 
scanner.  This  representation  is,  of  course,  somewhat  restrictive,  as  it 
incorporates  the  concept  of  horizontal  convexity.  Let  us  suppose,  as  an 
alternative,  that,  based  on  our  first  order  boundary  description,  we  agree  to 
subdivide  the  rectangle  containing  the  component  of  interest  into  a  number  of 
horizontal  sections  in  each  of  which  we  have  a  given  number  np  of  boundary 
points  per  scan  line,  1  <  p  <  NS.  Here  NS  is  the  number  of  sections. 

Furthermore,  let  us  assume  that  the  number  of  lines  in  each  section  p  does  not 
exceed  the  order  M  of  the  autoregressive  process  involved.  Our  idea  is  to  use 
the  same  type  of  methodology  proposed  in  the  Nahi -Jahanshahi  paper,  but  to  do  so 
to  better  advantage  by  appealing  to  a  more  thorough  preliminary  geometrical 
description.  To  establish  notation,  let  mjp  be  the  first  scan  line  of  section  p 
and  m2p  the  last  scan  line.  Also,  let  w^  =  (alrp.  a2rp’***’an  rp)  represent 

the  vector  of  boundary  points  in,  for  example,  a  left-right  scan  of  the  rth  line 
of  section  p.  We  shall  run  through  the  entire  process,  first  of  all,  for 
section  1.  Let  us  be  clear  concerning  the  statistical  model  to  be  used. 

Suppose  that  we  base  it  on  a  linear  autoregressive  model  such  as  that  given  in 
(4-6).  Then  one  has  a  state  dynamical  system  afforded  by  (4-8)  with  the 
measurement  model  (4-10).  In  particular,  consider  equation  (4-4)  with  the  pair 
(m,n)  replaced  by  (M+l.M+l).  Generalizing  to  the  colored  picture  representation 
one  has 


c_/i 
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s_(M+l,M+l)  =  CgR (M+l.M+l ),  gB(M+l,M+l),  gR(M,M+l),  gB(M,M+l), 

•••.  9R(1,M+1),  gB(l,M+l);  gR(N,M),  gB(N,M) . gR(l,M),  ge(l,M); 

(5-4) 

•••»  9r(n»1)»  gB(N,l),  ••••  gR(i,i),  gB(i,i)]t. 

Note  that  ^(M+l.M+l),  which  we  define  to  be  s_(M+l,M+l)  stripped  of  its  first 
2(M+1)  elements,  represents  the  state  of  the  first  M  rows.  Equation  (4-8)  with 
(m,n)  replaced  by  (M+l.M+l)  is,  of  course, 

s_(M+l,M+l )  =  Cs_(M,M+l )  +  w_(M+l,M+l)  (b-5) 

Forming  a  vector  from  the  2-vectors  _r(M+l,M+l),  _r_(M,M+l),  ....  _r_(  1,1),  we  have 

r^M+l.M+l)  =  (^(M+l.M+l),  r1  (M,M+1 ) . rt(l,l)t.  (5-6) 

The  measurement  equation  which  we  now  invoke  is 

r^M+l.M+l)  *  s(M+l,M+l )  +  yx (M+1,M+1),  (5-7) 

where 


vx (M+l.M+l )  =  (vt(M+l,M+l),  v^M.M+l),  ....  yt(l,l))t. 


(5-8) 


We  then  run  the  model  (5-5),  together  with  (5-7),  for  both  the  background  and  the 
object.  That  is,  we  prepare  to  execute  the  replacement  process.  Let 
s+(M,M+l)  be  the  expected  state  vector  after  the  measurement  update  at  pixel 
(M,M+1 )  and  s_(M+l,M+l)  the  expected  state  vector  at  (M+l.M+l)  before  the 
update.  Since  the  estimate  before  updating  is  obtained  by  using  the  transition 
matrix  C  to  propagate  the  state,  one  has  s_(M+l,M+l)  =  Cs+(M,M+1).  Consider  then 
the  equation 


rj(H+l,H+l)  =  Cs+(M,M+1 )  +  y2(M+l,M+l) 

=  s_(M+l,M+l)  +  v?(M+l,M+l). 


(5-9) 
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One  may  check,  in  a  straightforward  manner,  that  has  mean  zero.  It  remains  to 
determine  the  covariance  matrix  for  V£.  One  has 

E(v2(M+l,M+l)v2t(M+l,M+l))=E[(r1(M+l,M+l)-s_(M+l,M+l))(r1(M+l,M+l)-s_(M+l,M+l))t] 


=  E[(v1(M+l,M+l)  +  s(M+l,M+l )  -  s_(M+l,M+l)) 


(v^M+l.M+1)  +  s(M+l,M+l)  -  s_ (M+l.M+l))1] 

=  EC ( yx (M+l ,M+1 )  +  s_(M+l,M+l)) 

.(v^M+l.M+l)  +  s_ (M+l.M+l))1] 

=  Q  +  P  (M+l.M+l), 

V1  - 

where  s  =  s  -  s  . 


(5-10) 


It  is  seen  that  the  term  E(v1(M+l,M+l)  s_(M+l,M+l)t)  involved  in  (5-10)  is  zero, 
since  s_(M+l,M+l)  does  not  incorporate  the  measurement  update  at  (M+l.M+l). 
P_(M+1,M+1 )  is,  by  definition,  the  state  covariance  matrix  at  (M+1,M+1)  before 
updating.  Now  let  us  strip  off  the  first  2(M+1)  elements  of  each  of  the  three 
vectors  appearing  in  (5.9).  This  leads  to  an  equation 


rld(M+l,M+l)  =  s_d(M+l,M+l)  +  v2d(M+l,M+l),  (5-11) 


where  d  represents  the  deletion.  The  covariance  matrix  for  V2d  is,  of  course, 

just  the  appropriate  principal  submatrix  of  (5-10).  Let  Q  be  that  covariance 

*2d 

matrix,  and  suppose  that  we  find  an  orthogonal  matrix  A  which  diagonalizes 

0  .  That  is,  we  have 

-2d 

D (x )  =  diag  (A,, A,,,...)  =  AO  AT. 

1  c  -2d 


5-6 


(5-12) 


If  we  pre-  and  post-multiply  by 

-1  _L 

D^(x)  -  diag  (x^  ,  A2  »•••)» 

we  have 

I  =  D, (x )AQ  AtD 1 (X ) .  (5-13) 

1  -2d  1 

Thus,  we  may  premultiply  (5-11)  by  D^(x)A  and  obtain 

C1(J  (M+l.M+l)  =  s^M+l.M+l)  +  v^M+l.M+l),  (5-14) 

where  now  ^(M+l.M+l)  has  the  identity  matrix  as  covariance  matrix. 

Equation  (5-14),  in  conjunction  with  the  replacement  concept,  may  be 

construed  in  the  following  way:  After  executing  the  sloped  facet  model,  suppose 

that  we  have  defined  an  object  and  its  boundary  to  first  order  accuracy  and  have 

localized  the  object  to  a  rectangle.  Realizing  that  what  we  want  to  do  is  to 

perturb  the  boundary  using  the  replacement  concept,  we  agree  to  use  the 

covariance  matrix  Q  above  and-  to  apply  the  matrix  Di(x)A  in  order  to 

-2d 

transform  the  state  vector.  We  map  the  background  and  object  intensities 
by  Dj(x)A,  and  we  apply  the  replacement  principle  to  the  transformed 
quantities.  The  important  point  is  that  the  noise  covariance  is  provided  by 
(5-10)  as  applied  to  a  first  order  description  of  the  object  or  background, 
respectively.  Such  covariance  may  be  computed  without  exercising  the  filter. 

The  filter  itself  is  to  be  executed  for  both  the  background  and  the  object 
processes  in  the  case  where  either  fills  the  entire  rectangle.  The  object 
process  is  to  be  construed  as  that  pertaining  to  the  actual  scene  consisting  of 
object  as  influenced  by  background  and  background  as  influenced  by  object.  The 
background  process,  on  the  other  hand,  is  that  obtained  after  subtracting  an 
object's  influence.  An  example  is  that  of  a  ship  subject  to  environmental 
effects.  Taking  out  the  ship,  one  has  only  background  information.  The  ship 
plus  its  background  constitutes  the  object  process.  Now  there  are  two  kinds  of 
boundaries  with  which  we  are  concerned:  external  and  internal.  For  an  external 
boundary,  such  as  that  serving  to  outline  a  ship,  there  is  no  problem  with 
employing  the  usual  replacement  process.  However,  once  internal  boundaries  are 
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considered,  one  has  to  modify  his  notion  of  background.  In  effect,  if  we  know 


the  boundary  of  the  component  of  interest,  then  we  would  want  to  extrapolate  the 
gray  scale  values  for  the  rest  of  the  target  across  that  boundary.  Then  our  new 
background  would  consist  of  the  original  background  external  to  the  target,  the 
gray  scale  values  of  the  target  itself  minus  the  component,  and  certain 
artificial  values  created  by  extrapolation  into  the  "removed  component."  Now  a 
Kalman  filter,  in  conjunction  with  a  reasonable  first-order  description  of  the 
boundary,  can  be  used  to  "create  the  background."  To  do  this,  one  may  conceive 
of  the  removal  of  the  component  as  synonymous  with  the  deletion  of  data. 
Therefore,  one  needs  to  run  a  Kalman  filter  (or  other  filter)  over  the  object  in 
which  data  inside  the  first  order  boundary  estimate  mentioned  above  have  infinite 
(very  large)  measurement  and  process  variance.  This  procedure  will  have  the 
desired  effect  of  propagating  the  gray  scale  estimates  into  the  gap  created  by 
the  removed  part.  In  this  way  one  could  conceivably  improve  on  one's 
understanding  of  internal  boundaries.  In  the  following  discussion,  we  shall  give 
the  details  of  the  boundary  estimation  method. 

We  are  now  in  a  position  to  pursue  the  analysis  further.  From  the  above 
remarks,  we  shall,  without  loss  of  generality,  assume  a  model  of  the  form 

X  =  S_(H,  W.)  +  U_»  ( b-lb ) 

where  X  is  the  set  of  measurements  of  both  red  and  blue  intensities  for  the  first 
M  rows,  X  is  the  state  vector  which  starts  at  row  m^  and  ends  at  row  m^,  there 
being  M  or  fewer  rows  altogether.  Now  remember  that  the  first  section  of  the  NS 
sections  is  characterized  by  the  presence  of  nj  boundary  points,  where 
Wj,}  denotes  the  vector  of  boundary  points  in  the  rth  row  of  that  slice. 

Generally  speaking,  n  will  be  an  even  integer;  but,  when  two  boundary  points 
coalesce,  a  special  case  arises  in  which  the  number  of  distinct  points  is  odd.  X 
is  a  vector  formed  from  •  •  • »  UP  t0  the  number  of  rows  in  the  section,  and 

X  is  the  noise  term  whose  covariance  matrix  is  the  identity  matrix. 

We  want  to  be  a  bit  more  specific  about  the  vector  S_( *  s->nce  we  need  To 
distinguish  between  points  within  the  component  of  interest  and  those  outside  of 
it.  In  conformance  with  the  notation  in  the  Nahi -Jahanshahi  paper,  let 
sbR ( k )  and  s^k)  represent  the  red  and  blue  intensity,  respectively,  at  a  point 
outside  the  component,  i.e.,  in  the  background.  Also,  let  $0p(k)  and  sog(k)  be 
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the  analogous  quantities  at  object  (component)  points.  As  in  the  above  quoted 
paper,  let  J  be  the  number  of  columns  of  pixels.  Then,  for  the  first  mu-1  rows 
(m^  being  thought  of  as  random),  all  the  s  values  are  background  quantities; 
whereas,  for  m^*  r  <  m^,  one  has  the  following  row  pattern: 

sbRC(r-1)J+l!l,  sbBC(r-l)J+l],  ...,  sbRC(r-l )J+olr-l], 

sbBL(r‘1)J+alr'13’  SoR[(r'1)J+alr:i»  SoB[(r'1)J+alr]*  — » 
soR^r"1^J+ot2r3,  soB^r“l3J+0l2r3*  sbR^r_1  ^J+ct2r+l3, 

(5-16) 

sbB^r"1^J+a2r+l3,  ****  sbR^r“1^J+ot3r'13’  sbB^r_1  ^J+a3r‘l3, 
soR[(r-l)J*«3r].  s08C(r-lj4"3r;i>  — •  soRC(r'1)J*“n1r]l 

soB[(r-l)J+.„ir],  sbRt(r-l)J%ir+U.  s^Kr-DJ^^r+O. 

.  .  .  ,  ^  )  * 


Note  that,  in  our  formulation,  m2i  is  generally  not  random.  We  shall  attempt  to 
maximize  the  joint  probability  density  function  p(Y_,  M,  W_) »  which  is  just 


p(I,M,W_)  =  p(Y_|M,Wj  p(WjM)  p(M). 

If  U_is  assumed  to  be  Gaussian,  we  have,  therefore, 
p(Ll’}D  =  (2*)‘N/2  exp  {-  j  [Y  -  S(M,W)D‘[Y  -  S(M,W)3 

+  In  p(W)M)  +  In  p(M)} , 


(&-17) 


(5-18) 


where  N  =  2Jm2i«  Maximizing  (5-18)  with  respect  to  M_and  W_is  equivalent  to 


5-9 
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max{-  i  [Y  -  S(M,W)]'[Y  -  S(M,W)] 
M,W 

+  In  p{W|M)  +  In  p(M)}, 


(5-19) 


or 

min  {S 1 (M,W)  [S(M,W)  -  2Y]  -  2  In  p(W|M>-  2  In  p(M)}.  (5-20) 

M,W 

Let  us  transform  the  first  of  the  three  terms  within  the  curly  brackets  of  (5-20) 
into  scalar  notation.  We  have  then 


S_' [S(M_, WJ  -  2Y_] 


"’ll-1  rd 


•l  l 

r=l  k=(r- 


1)J+1 


MM 


m21 
+  I 

r=m 


11 


(r-l)J+air-l  (r-lJJ+a^r 


=1- 


k=(r-l)J+l 


Kh(k)  ♦  l 


k=(r-l)J+alr  0 


MM 


( r-1 )J+a 
+  ...  +  l 


nlr 


.  MM 


rJ 

+  l  MM  I  , 

k=(r-l )J+a_  +1  D 


(5-21) 
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Mk>  =  KbR(k)  +  Wk) 

K0(k)  =  KoR(k)  +  KoB(k), 
with 

KbR^k^  =  sbR^k^  t-sbR^k^  "  2y R < k > ^ 
KoR(k)  =  soR(k)  CsoR(k)  ‘  2*R(k)] 


(5-22) 


(5-23) 


and  relations  similar  to  (b-23)  for  the  blue  components.  Note  that  the  summands 
in  ( 5-21 )  are  all  known,  the  unknowns  being  the  a's  and  m^ 


Now  let  us  add  to  and  subtract  from  (b-21)  the  following: 

oi2i  (r-l)J+a2r  (r-l)J+a4r 

I  C  1  K  (k)  ♦  T  K  (k) 

r=mll  k=( r-1 )J+a  k=(r-l)J+a2r 


nlr 


(r-l)J+a, 
k  =  (r-l)J4ani_Ur 


Kb(k)3 


(5-24) 


(b-19)  is  thus  converted  to 

m21  rJ 

1'  (M, W_)  ll(M,W)  -  21]  =  l  I 

r=l  k=(r-l)J+l 


Vk> 


m21 
+  l 

r=m 


11 


( r-1  )J+a, 


2r 


l 

k=(r-l)J+alr 


[KQ(k)  -  Kb(k)3 


(r-l)J+a4r 

+  l  CK  (k )  -  K  (k)3 

k=(r-l)J+a3r  0 

( r-1  )J+a 


nlr 

+  ...  +  l  [k  (k)  -  K.(k)3  . 


(b-2b) 


-11 
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The  first  group  of  terms  in  (5-25)  does  not  depend  on  mu  or  the  a's,  so  they  may 
be  eliminated  from  consideration  in  (5-20).  In  other  words,  remembering  that  n^ 
is  generally  a  positive,  even  integer  and  letting 


T^)  = 


Hj/2  (r-l)J+a2j  r 

l  1 

j=l  k>(r-l)J4»2j.i>r 


[Ko(k)  -  K&(b) J, 


one  wants  to  consider  the  minimization  problem 


m2i 

min  [-21n  p(M)  -  21n  p(wjfi)  +  l  T(w  )].  (5-26) 

M,W  r=mn 


Now  let  us  examine  the  nature  of  p(M_)  and  p(wjM_).  First  of  all,  p(M_)  is 

just  pfmjj)  for  section  1,  since  m2 j  is  preassigned.  This  reflects  the  fact  that 

we  do  not  know  with  certainty  the  first  row  of  the  component.  Also,  in  general, 

let  us  suppose  that  boundary  point  asrp»  namely,  boundary  point  s  in  row  r  of 

section  p,  is  conditioned  only  on  a  ,  and  o  ,  its  nearest  boundary 

s-i >  r  ,p  s , r-i ,p 

points  "preceding"  it  in  row  r  and  r-1,  respectively.  It  follows  that 


p(w|n)  •  p(w„  .I"  i>  P<<V,.ib.  -2> 


"21  21 


21 


21 


•••  P(w„  4.1  |w„  )  p(w„  ), 

r  — rn^+l'— m^j  "^li 


(5-27) 


where,  generally. 
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P^rl^-l)  =  p(alr,a2r . an1rl0l,r-l»a2,r-l*#,*0n1,r-l^ 


From  (5-27)  one  has 


p^an1rlotn1-l,r’an1  ,r-l^  *p^an,  -l,rlan1  -2,r’an, 

111  ill  (5_28) 

•  p(an1-2,rl“n1-3>r’  “n^.r-l*  *•* 

•  p(alrl°l,r-l)* 


m21 

In  p(WjH)  =  I  In  p(wr ) , 


(5-29) 


where  p(w  |w  ,)  =  p(w  ).  Substitution  of  (5-29)  into  (5-26)  yields 
“tT,ll,~mil-i  "^ll 


min  (-2  In  p(mn)+  J  [T^)  -  2  In  ptwjv^.j)]} 


(6-30) 


The  reader  will,  of  course,  observe  that  what  we  have  done  so  far  is  to  use 
the  Nahi -Jahanshahi  analysis  in  a  slightly  more  general  format.  We  shall 
continue  to  follow  the  general  outline  of  their  modeling  procedure.  Therefore, 
let  us  fix  mjj  in  (5-29)  equal  to  that  dictated  by  the  sloped  facet  model 
result.  Then  all  we  need  do  is  minimize  with  respect  to  the  sum  of  the  terms 
from  m^  to  m2i  indicated  in  (5-30)  Let  us  consider,  first  of  all,  the  natural 
logarithm  of  (5-28).  We  have 


"I 

In  =  .I=i  In  p(air|o..ljr,  a^^), 

so  that  the  minimization  problem  which  we  would  like  to  consider  is 


(*>-31 ) 


m21  n2 

min  {  l  [T(w  )  -  1  \  In  p(a.  |a.  ,  ,  a.  J]} 


(5-32) 


v  v  V  V 
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The  problem  is  that  of  providing  a  recursive,  easily  implementable  procedure  for 
(5-32). 


Note  that 


Hw,.)  =  I  ^i^a2j >r^  "  q2(a2j-l,r))* 
J  * 


where 


(r-1)J+a2j>r 


ql(a2i  r  ^  I  K(k) 

1  ^J,r  k=(r-l )J+1 


(5-33) 


(5-34) 


q2(a2j-l,r)  = 


(r-l)J^2j.1>r-l 

k*(r-l)J+l 


and  K(k )  =  KQ(k)  -  K^(k).  Now  let 


h(a2jjr) 


"  ql(a2j,r)  '  2  1n  p(a2j,rl°2j-l,r’  °2j,r-l) 


(5-35) 


9(a2j-l,r)  =  *q2(a2j-l,r)  "  2  ln  p(a2j-l,r la2j-2,r »  a2j-l,r-l)»  (5"36) 

where  p(ai  r,  r_j)  is  approximated  by 

IA  A  A 

a,  ,  r»  ai  r-i )  and  r  is  an  estimate  of  a.  .  (5-32)  then  becomes 

-»■  i i ) i "*  j  fr  j  >  i 


m21  nl/2 


min  {  l  l  [h(a?,  )  +  9(02,-,  )]} 

W  r=mu  j=l  63  i,r 


(5-37) 


m2  ^  n^/2 


I  l  i>in  h(o?i  r>  +  ain  9(a2i-l  1 

r=mn  j=l  a2j,r  °’r  a2j-l,r  c 3  1,1 
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The  minimization  (5-37)  can  be  performed  recursively  in  a  natural  way.  For 

A 

r-»n,  m  is  obtained  by  minimizing  g(a^  m  ),  which  is  a  function 


11 


of  a.  only.  Since  h(a?  m  )  obviously  involves  knowledge 

Cy  ITlji 


of  a,  ,  once  a,  _  is  determined,  a0  can  be  obtained  by 
l.m^  1*mll  ^»mll 

minimizing  h(a9  ).  This  process  is  continuable,  allowing  us  to  obtain  all 
‘•■ll 

the  minima  involved  in  (5-37),  so  that  (5-37)  is  solvable  in  iterative  fashion. 


Proceeding  any  further  requires  an  understanding  of  the  density 

A  A 

functions  p(a  Jo  -  ,  ,  a-  ,).  Reasonable  density  functions  are  obtainable 

Jfl  *  J  —  ljl 

through  our  first-order  understanding  of  the  component's  boundary.  For  example, 
suppose  that  the  sloped  facet  model's  output  is  the  quantity  a,0  for  the 

i, mu 

first  boundary  point  on  line  rr^j.  Assuming  that  we  want  to  perturb  the  boundary 

using  the  filter  results  on  a  finer  grid  than  that  used  in  the  facet  model,  let 

us  choose  a  certain  multiple  x  of  coarse  grid  space  units  as  our  standard 

deviation.  That  is,  if  a  is  the  coarse  grid  spacing  along  a  row,  let  xa  be  the 

standard  deviation,  with  a?  the  assumed  mean.  We  shall  then  assume  a 

*»mH 

Gaussian  density  of  the  form 


1 


P(«l 

A’mll  /2tT 


exp  [  -  (a 


XA 


l,m 


11 


o 

al,m 


11 


>2/2xV] 


(5-38) 


In  general,  suppose  that  we  have  available  a°r  for  every  element  i  of  any  row  r 

in  the  fine  grid  format.  Then  choose  x.  a  units  for  the  standard  deviation  in 

ir 

such  fashion  that  new  choices  for  a.  ,  ,  a.  ,  and  will  follow  the 

i-i,r  l r  i+i, r 

appropriate  order  r<  a..^  <  r  when  the  process  is  completed.  Thus  a 

compromise  is  to  be  effected  between  liberty  of  change  and  disambiguity  of 
location  of  boundary  points.  For  example,  clearly  two  boundary  points 
initially  yA  units  apart  on  a  line  would  require  a  standard  deviation  of  at 
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most  yA/6,  so  that  their  adjusted  locations  would  not  in  all  probability  cause 
a  reversal  of  their  order  on  the  line.  Let  us  also  assume  a  martingale  type 
process,  so  that 


a 


i.r-1* 


(5-39) 


where  E  denotes  the  expected  value  operator.  Then  we  can  assume  that 


P(otirlai-l,r)  =  exp  C_  (air‘  “i^-l^ir2**3 


(5-40) 


where  xir  is  chosen  analogously  to  xir  in  such  fashion  that  a  standard  deviation 
of  x^a  units  leads  to  the  proper  ordering.  That  is,  we  would  prefer  to  use 
updated  values  for  air  as  they  become  available  rather  than  the  air°  provided  by 
the  first  order  approximation.  Of  course,  as  we  examine  each  section  p,  where  1_< 
p  <_NS,  we  may  need  to  use  information  provided  by  the  first  order  methodology, 
especially  when  a  large  number  of  boundary  points  per  line  is  involved 
(corresponding  to  new  branches  of  the  boundary).  Note  that  some  of  the 
difficulties  inherent  in  the  discussion  of  the  Nahi -Jahanshahi  paper  are 
automatically  avoided  by  our  a  priori  knowledge  of  a  reasonable  boundary.  Once 
we  input  (5-38), (5-39),  and  (5-40)  to  (5-37),  we  can  solve  (5-37)  for  a  given 
value  of  mjj.  A  reasonable  starting  value  for  m^  would  be  that  provided  by  our 
first  order  methodology. 


Now  that,  for  a  given  value  of  m^,  (5-37)  has  been  solved,  let  us  now  refer 

A  A  A  .  A  A 

to  relation  (5-30).  Let  f(r)  =  T ( w  )  -  21  n  p(wjw  _,),  where  w  has  the  obvious 

A  '  A  '  "  ^  I 

connotation  wherein  a.  replaces  a.  .  Since  f(r)  is  now  a  known  quantity  for 

jr  r  jr 

every  r,  consider 

m21  A 

min  [-2  In  p(m^)  +  £  f ( r ) J .  (5-41) 

mll  r=mll 

The  density  p(m^)  can  be  assumed  to  be  Gaussian  with  mean  m^0,  where  mjj0 
is  the  first  line  of  the  sloped  facet  image.  Let  the  standard  deviation 
be  y^,  where  y1  is  some  preassigned  positive  number  and  a^  is  the  coarse  grid 
spacing  in  the  column  direction.  For  example,  y^  might  be  taken  to  be  1.  Then 


^  k  * 
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we  have 


p*mll)  =  Tu 


-  (m 


11 


-  m 


11 


On2 


)V2y 


2a  2 
1  A1  ‘ 


ylAl 


(5-42) 


The  minimization  problem  (5-41)  can  then  be  solved  through  a  search  procedure. 
Thus  we  have  a  method  for  analyzing  section  1.  Sections  2  through  NS-1  are 
resolved  in  similar  fashion,  the  only  difference  being  that  lines  and  m^p  for 
the  pth  section,  2  p  _<_  NS  -  1,  are  known,  so  that  the  only  undetermined 
quantities  are  the  a's.  In  general,  mlp  =  ^2,p-l  +  and  m2p  =  ml  p+1"1* 
Finally,  for  section  NS,  m^^^  is  known,  but  is  random.  Analogously  to 

(5~ 42),  we  may  use  our  sloped  facet  model  results  to  procure  a  density  function 
for  m2  If  one  desires,  one  may  use  the  results  thus  obtained  to  proceed  even 
further.  One  now  can  construct  a  new  background  process  and  compare  it  once  more 
with  the  object  process.  Running  through  the  entire  operation  again,  perhaps 
sweeping  the  grid  in  a  different  order,  the  boundary  can  be  further  refined. 

After  several  executions  of  the  method,  that  boundary  corresponding  to  the 
largest  value  of  p(Y,  rt,  U)  could  be  chosen.  A  nonlinear  filter  should  produce  a 
measurement  model  quite  similar  to  (b-9);  and,  running  background  and  object 
nonlinear  filters,  we  could  use  the  estimates  thus  obtained  as  input  to  the 
process  just  described.  We  shall  not  pursue  this  topic  in  any  further  detail. 
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CHAPTER  6 

TARGET  IDENTIFICATION  AND  DESCRIPTION 


GENERAL  APPROACH 

In  some  sense,  the  topic  of  target  identification  and  description  is  the 
most  interesting  of  all  those  items  discussed  thus  far.  Much  literature  has  been 
devoted  to  this  subject,  the  emphasis  being  placed  on  either  a  geometric  or  a 
syntactic  understanding  of  a  preprocessed  image.  For  the  purposes  of  target 
recognition,  it  seems  convenient  to  adopt  a  geometric  approach  of  some  sort  in 
conjunction  with  a  statistical  classification  strategy  if  one  desires  to 
understand  certain  basic  target  components.  On  the  other  hand,  syntactic 
descriptions  can  be  quite  useful  when  one  wants  to  relate  different  parts  of  a 
ship  or  missile.  A  hierarchy  of  understanding  is  possible  in  this  manner, 
allowing  one,  for  example,  to  distinguish  enemy  from  friendly  ships.  In  this 
chapter,  we  shall  cover  both  the  geometric  and  syntactic  approaches,  relying,  for 
our  purposes,  on  methods  previously  developed  in  the  literature. 

Let  us  begin  by  emphasizing  the  importance  of  multispectral ,  or  multiband, 
analyses  of  a  target  prior  to  attempting  to  classify  it.  We  shall  illustrate  the 
idea  from  the  point  of  view  of  infrared  signature  processing.  Imagine  a 
situation,  of  sime  practical  significance,  whereby  a  box  arrives  in  the  mail 
addressed  to  the  President  of  the  United  States.  It  may  be  that  the  box  contains 
a  nice  assortment  of  shirts  and  ties,  but  it  is  also  possible  that  the  box 
contains  a  small  bomb.  By  merely  looking  at  the  object,  one  cannot  ascertain  its 
contents,  but  it  may  well  be  possible  to  do  so  by  making  use  of  infrared  or  sonar 
spectra.  An  analogous  situation  is  present  in  target  recognition  problems.  An 
engine  room  is  likely  to  be  quite  a  bit  warmer  than  a  passenger  space,  yet  the 
geometric  shape  of  both  compartments  could  be  exactly  the  same.  Therefore,  a 
classifier  which  explicityly  takes  into  account  different  wave  bands  and  which 
attempts  to  correlate  the  results  in  these  bands  is  much  more  likely  to  be  robust 
than  otherwise. 
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Let  us  suppose  that  we  have  enumerated  a  number  of  generic  ship  components, 
each  deemed  important  in  some  way  for  recognizing  the  target.  Suppose  that  there 
are  m  of  them,  designated  through  the  set  of  labels  L.. ,  1  <  i  <  m.  Then  the 
first  game  which  we  want  to  play  is  that  of  using  the  gray  level  information  in 
order  to  make  an  initial  assignment  of  a  probability  vector  =  (Pji»Pj2» 
...,Pjm,Pj jIT1+i)  to  each  of  n  target  objects  a^,  1  <  j  <  n.  In  this  notation 
pj.j,  1  <  i  <  m,  is  the  probability  that  object  aj  has  label  L^,  1  <  i  <  m. 

Also,  pj  m+1  is  the  probability  that  aj  has  an  "unimportant  label",  which  we  may 
denote  by  Lm+1.  Now  how  can  one  use  the  gray  level  information  obtained  from 
some  component  aj  in  order  to  procure  the  Pji ‘s?  We  shall  introduce  two  methods 
for  doing  so,  the  first  based  on  knowledge  of  object  boundary  and  the  second 
based  on  gray  level  information. 

BOUNDARY  CLASSIFICATION  METHODOLOGY 

We  shall  first  present  a  classification  strategy  based  on  the  boundary,  as 

Ip 

given  by  Duda  and  Hart  .  It  is  not  a  sophisticated  approach,  since  one  could 
very  well  not  tell  the  difference  between  two  objects  having  the  same  boundary, 
but  entirely  different  temperature  profiles.  However,  it  will  be  seen  to  have 
several  useful  inherent  properties.  In  target  recognition  one  attempts  to  devise 
parameters  for  identification  which  are,  in  some  sense,  independent  of 
orientation  in  the  field  of  view,  position  in  the  field  of  view,  and  range  from 
observer  to  object.  Our  boundary  classifier  will  automatically  have  the  first 
two  properties,  but  not  the  third.  In  some  sense,  this  is  not  detrimental  when 
one  observes  that  it  is  physically  impossible  anyhow  to  build  a  recognition 
capability  totally  insensitive  to  range.  All  one  needs  to  imagine  is  two  objects 
of  the  same  shape  but  of  different  size  which,  due  to  their  respective  distances 
from  the  viewer,  cause  the  same  image  to  be  produced.  Therefore,  what  one  really 
tries  to  do  is  to  make  as  many  parameters  insensitive  to  range  as  is  feasible, 
since,  by  so  doing,  one  is  bound  to  reduce  computational  effort  considerably  when 
one  wants  to  recognize  an  unknown  object. 

Suppose  that  we  have  a  discrete  description  of  the  boundary  and  that  we  have 
connected  the  points  by  a  smooth  curve  (perhaps  using  cubic  spline  approximation 
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methods).  Let  us  look  at  the  curvature  k(s)  of  this  approximate  boundary,  where 
s  is  an  arc  length  parameter,  and  develop  it  into  a  Fourier  series: 

CD 

k(s)  =  l  cnexp(2Trins/L),  (6-1) 

n=-» 


where  L  is  the  length  of  the  boundary  and 
l  L 

c  =  r-  /  k(s)  exp ( -2tt i ns/L)ds . 
n  l  Q 


(6-2) 


We  shall  then  agree  to  use  a  statistically  significant  number  of  the  coefficients 
cn  to  build  a  "feature  vector"  for  the  boundary.  In  other  words,  one  might 
select  a  positive  integer  N  and  use  the  vector  (c_N,  c_N+1,  ....  cq,  cj,  ....  cN) 
to  characterize  the  boundary.  Note  that  curvature  is  an  intrinsic  property  of  a 
smooth  curve  and  thus  is  automatically  independent  of  orientation  or  position  in 
a  field  of  view.  However,  one  sees  that  the  curvature  of  the  image  boundary 
varies  essentially  directly  as  the  distance  of  the  object  boundary  from  the 
observer.  The  reason  is  as  follows:  The  projection  of  a  space  curve  (X(s), 

Y (s ) ,  Z(s))  onto  a  viewing  plane,  as  shown  in  Figure  6-1,  is  given  by12 


u($)  =  g(Y(s))  X(s) 
v(s)  =  g ( V (s ) )  Z(s), 


(6-3) 


where 

g(Y(s))  =  f/(Y(s)+f)  (6-4) 


and  f  is  the  focal  length  of  the  camera  employed.  Of  course,  as  previously 
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FIGURE  6-1.  PROJECTION  OF  A  SPACE  CURVE  ONTO  AN  IMAGE  CURVE 

mentioned,  the  curve  (u(s),  v(s))  is  only  a  smooth  approximation  to  the  actual 
boundary  curve.  It  follow  that  (X(s),  Y(s),  Z(s)),  as  related  to  (u(s),  v(s)) 
through  equations  (6-3),  is  really  just  an  approximate  representation  of  the 
underlying  space  curve.  Now  the  curvature  of  the  image  boundary  is  representable 

pp 

by  the  following  expression  : 


Kj  (s ) 


=  H  (s)v  ~  v  (s)  u  (s) 
(Cu-(s)32  +  Cv'(s)A^/Z 


(6-5) 


p  p 

where  a  prime  denotes  differentiation  and  where  we  assume  that  u  (s)+v  (s)  t  0 

for  any  s.  Forming  the  derivatives  required  in  (6-5),  one  finds  that 
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Kj(s)  =  {2[g*(Y)]2  [y(s)]2  [X(s )  Z  ‘  (s )  -  X*  (s)  Z(s)] 

+  g(Y)  g'(Y)  y  1  (s )  CX(s)  Z 1  ‘ (s)  -  x"(s)  z(s )] 

+  g(Y)g“(Y)  [Y 1  (s) J2[X‘ (s)  Z(s)  -  X ( s )  Z'(s)] 

+  g(Y)  g'(Y)Y"(s)[X'(d)  Z(s )  -  X (s )  Z  ‘  (s  )  ] 

+  g2(Y)  [X'(s)  Z ' ‘ (s )  -  X 1 ' (s )  V  ( s ) ]}  (6-6) 


{Cg‘(Y)]2[Y'(s)]2[x2(s)  +  Z2(s )] 


+  2g(Y )g 1  ( Y )Y ' (s )  [X(s)  X‘(s)  +  Z(s)  Z'(s)‘J 


+  g2(Y)  C(x*(s))2  +  (z‘(s))2]}3/2 


Note  next  that  g(Y)  =  f/Y,  g'(Y)  =  -f/Y2,  and  g' '  (Y)  =  2f/Y3.  Furthermore,  a 
translation  of  origin  along  the  Y  direction  leaves  Y'(s)  and  Y 1 ' (s )  unchanged. 

o  o 

Therefore,  factoring  g  ( Y (s ) )  from  the  numerator  and  g  (Y(s))  from  the 
denominator  of  (6-6),  one  has 


kt(s)  =  QllZIi  FU) +  CX'(S)  z"(5?  -  x1  * (s )  Z*(s)J 
g(Y )  [0(1/Y )G(s)  +  (X ' (s )  i  +  (Z'(s))2]3/2  * 


(6-7) 


where  F(s)  and  G(s)  are  well-defined  functions  of  s  alone  and  where  the  usual 
order  notation  has  been  employed.  Now  let 


1  X 1  ( s )  Z"(s)  -  X1  *  (s)  Z ' (s) 
TT  y  TT7T  * 


KIA(s)  =  Q / y  1  *  *  7*  ‘  2  3/2 

9m  c(x'(S))2  +  iz'isjrr 


(6-8) 


where  we  assume  that  (X‘(s))2  +  (Z'(s))2  t  0  for  any  s.  If  likewise  X’(s)  Z 1 1 (s ) 
-  X"(s)  Z'(s)  t  0,  one  has  KjA(s)/Kj(s)  +  1  as  Y(s)  -*■  +  °>.  Thus,  when  the 
camera  is  translated  along  the  Y  axis,  KjA(s)  is  asymptotic  to  Kj(s)  at  every 
point  where  the  above  requirements  are  met.  Note  further  that  KjA(s)/Kj(s)  tends 
uniformly  to  1,  provided  that  X(s)  and  Z(s)  are  twice  continuously  differentiable 
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and  that  both  (X‘(s))2  +  (Z'(s))2  and  X'(s)Z"(s)  -  X"(s)Z'(s)  are  nonzero  for 
every  s.  Also,  we  may  replace  g( Y ( s ) )  by  g( Yqq )  for  large  distances  from  camera 
to  object,  where  Ygg  is  the  Y  coordinate  of  the  center  of  geometry.  We  now  have 

k?‘5>  =  iI^ykoA(s)-  <6-9> 

where 


k  a(s)  ,  . 

0  C(X' (s)2  +  (Z'(s))2]3/2 


(6-10) 


It  follows  that,  for  large  Ygg,  kQA(s)  can  be  computed  once  one  has  calculated 
k|A($)  and  once  one  knows  Ygg.  It  is  assumed  that  Ygg  would  be  obtained  through 
some  tracking  algorithm.  The  expression  kQA(s)  is  indeed  a  range-independent 
quantity  which  we  can  input  to  (6-1)  and  (6-2)  and  for  which  we  can  build  a 
feature  vector.  Once  we  have  such  a  feature  vector,  what  do  we  do  next?  We  must 
perform  what  is  called  a  target  reconstruction.  That  is,  we  examine  the  object 
from  a  number  of  different  aspects,  and  for  each  aspect  i ,  we  construct  a  feature 
vector^1  =  (c.pj1,  c.^1 ,  ...,  Cq1  ,  Cj1 ,  ....  c^1).  where  1_<_  i  _<_  M  and  there  are 
M  aspects  considered.  Assuming  that  there  is  a  Gaussian  process  underlying  the 
vectors  jc1,  we  then  form  the  sample  mean  vector  £  =  £  cVm  and  the  sample 
covariance  matrix,  whose  entries  are  just  mps  = 


M 


I 

i=l 


cr)  (c,.1  -  cs)/M. 


THE  MOMENT  AREA  METHOD  FOR  TARGET  CLASSIFICATION 

Clearly,  the  method  just  presented  has  built-in  limitations.  One  is  forced, 
first  of  all,  to  "regularize"  the  boundary,  i.e.,  to  smooth  it  in  such  a  manner 
that  corners  are  eliminated.  This  may  or  may  not  be  a  proper  procedure, 
depending  on  the  accuracy  one  wishes  to  achieve.  Secondly,  a  classifier  derived 
through  such  boundary  considerations  may  not  adequately  reflect  the  underlying 
gray  level  signatures.  We  are  thus  led  to  a  more  sophisticated  method  called  the 

p  'i  1 

method  of  moments,  as  studied  by  Hu  *  and  Dudani  .  They  basically  applied  the 
classical  theory  of  moments,  as  developed  by  Cayley,  to  target  detection. 
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Theoretically,  an  image  of  an  object  is  uniquely  describable  in  terms  of  all  the 
moments  of  its  gray  levels.  From  a  practical  point  of  view,  one  needs  to  form  a 
significant  number  of  such  moments.  Then  one  needs  to  combine  them  in  such 
manner  so  as  to  yield  parameters  which  are  invariant  with  respect  to  orientation, 
position  in  the  field  of  view,  and  range.  Since  this  approach  is  wel 1 -documented 
by  Hu  and  Dudani ,  our  discussion  in  this  regard  will  be  cursory. 


Before  we  continue  with  our  discussion  of  the  moment  method,  let  us  prodeed 
a  little  further  with  some  of  the  geometrical  aspects  of  target  imaging.  Such 
aspects  will  be  important  with  regard  to  feature  vector  formation.  The  analysis 
was  performed  by  Dudani  in  his  doctoral  dissertation  and  proceeds  as  follows: 
Suppose  that,  as  above,  we  have  a  camera  coordinate  system  X,  Y,  Z  as  indicated 
in  Figure  6-1.  Let  us  suppose  that  the  axis  system  x,  y,  z  for  our  three- 
dimensional  object  is  initially  aligned  with  this  system.  Therefore,  initially 
we  have 


(6-11) 


Relation  (6-11)  provides  a  fiducial  orientation  for  the  object.  Having  adopted 
such  an  orientation,  we  can  specify  any  new  orientation  in  the  following 
manner:  First  let  us  rotate  the  object  about  the  Y  axis  through  an  angle  0. 

This  produces  a  new  coordinate  system  (x,Y,z)  fixed  to  the  object.  Next  rotate 
about  the  new  z  axis  through  an  angle  \|»,  producing  an  (x',y,z)  system.  Finally 
rotate  about  the  x'  axis  through  angle  $  to  produce  an  (x',y',z')  coordinate 
frame.  Dudani  notes  that  the  product  of  these  three  transformations  leads  to  the 
connection 


“ 

- 

cos  0  0  sin  0 

0  1  0 

cos  -  sin  0 
sin  ip  cos  0 

-sin  0  0  cos  0 

0  0  1 

1  0  0 

X 

0  cos  $  -sin  $ 

y 

0  sin  $  cos  $ 

z 

(6-12) 


where  we  agree,  for  the  sake  of  notational  simplicity,  to  drop  the  primes  on  the 
object  coordinates.  Finally  one  translates  the  origin  of  the  object  frame  of 
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reference  to  a  point  (A,  B,  C)  with  respect  to  the  camera  coordinate  system  X,  Y, 
Z.  One  obtains 


Y  =  sin  iji 


cos  e  cos  \p  -cos  e  sin  \jj  cos  <p  cos  e  sin  4»  sin  4»" 


+  sin  e  sin  <p 
cos  1 p  cos  <p 


+  sin  e  cos  <f> 
-cos  sin  <|> 


y  +  B  (6-13) 


-sin  0  cos  ip  sin  0  sin  cos  41  -sin  0  sin  i|>  sin  <|> 


+  cos  0  sin  <j> 


+  cos  0  cos  <{> 


Suppose  next  that  the  optical  axis  of  the  camera  is  directed  to  pass  through  the 
center  of  geometry  (origin)  of  the  object  frame,  so  that  A=C=0.  Using  (6-3)  with 
yCG  replacing  Y(s),  one  sees  that 


u2  +  v2  =  d2  (X2  +  Z2), 


(6-14) 


where  d  is  a  constant  for  given  YqG  =  B.  Using  (6-13),  together  with  some 

2  2 

algebra,  one  finds  that,  provided  A=C=0,  both  d  and  X  +  Z  are  independent 
of  0.  Therefore,  the  right  side  of  (6-14)  is  solely  a  function  of  <|i,  <j>,  and  B. 
Fixing  x,  y,  2,  ip,  <p,  and  B  and  letting  0  vary  between  U  and  2n,  we  find  that  the 
point  (u,v)  traces  out  a  circle  in  the  uv-plane  of  radius  d(X2  +  Z2)1^2.  In 
other  words,  rotation  of  the  object  in  the  0  variable  about  its  own 
center  of  geometry  merely  produces  rotation  of  the  image  in  the  uv-plane.  It 
follows  that  we  need  to  develop  features  based  on  knowledge  of  i|»  and  4.  On  the 
other  hand,  if  we  can  obtain  features  invariant  with  respect  to  rotation  in  uv- 
coordinates,  we  may  ignore  variations  in  0. 

Let  us  now  go  back  to  our  gray  scale  description  in  terms  of  red  and  blue 
intensities  gp(i,j)  and  gg(i,j)  and  form  the  two-dimensional  moments  of  the  image 


"pq  ‘  8^  1  “i"  tfi" 


(6-16) 


»  6  =  J- 

DO 
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where  gTR  =  l  gR(i,j),  gTB  =  l  gB(i,j),  and  the  sums  are  over  all  pixels  in 
the  image.  The  red  and  blue  intensities  in  (6-15)  represent  the  outputs  of  a 
time-varying  random  process.  Typically,  they  might  be  the  expected  values  for 
red  and  blue  signatures  under  a  given  set  of  environmental  conditions,  which 
might  include  solar  radiation,  cloud  cover,  atmospheric  emissions,  and 
atmospheric  haze.  In  other  words  they  should  typify  outputs  corresponding  to 
actual  conditions  to  which  a  target  would  be  subjected.  Out  of  (6-15),  one  can 
construct,  based  on  the  work  of  Hu  and  Dudani ,  the  so-called  similitude  moment 
invariants  with  respect  to  size,  elevation,  and  distance  B  along  the  optical 
axis.  Let  these  be  denoted  by  M.  (ip,  <(»)  and  (ij> ,<j>),  X  _<_  i  _<^  NI ,  where  there 
are  NI  invariants  considered  based  on  the  red  and  blue  information.  These  moment 
invariants  become  the  elements  of  our  feature  vectors.  Just  as  with  our  boundary 
vector  (c_N,  c_n+1 , . . . ,cq,  cj,...,c^),  we  may  consider 


(MXj  (4'i  ,+i ) ,  MXj  (fi,*i),  M2j  (4'i  ,4>i ) ,  M2j  (fi,^i  ),...,Mni>j 


D 

MNI  J  as  a  Russian  vector,  where  1  <  i  <  pa  covers  pa  aspects 


and 


1  <  j  <  pe  covers  pe  sets  of  environmental  conditions.  A  sample  mean  and  a 
sample  covariance  matrix  for  the  pa  aspects  and  the  pe  environmental  situations 
may  then  be  constructed. 

One  may  imagine  selecting  a  number  of  interesting  objects  a.,  1  <  j  <  n,  as 

J 

we  mentioned  at  the  beginning  of  this  chapter,  with  the  intent  of  assigning  a 
label  probability  vector  £j  to  each,  based  on  our  feature  vectors  (using  either  a 
moment  area  technique  or  a  boundary  technique  or  both).  Imagine  that  we  have 
built  a  Gaussian  distribution  for  each  label  L-j ,  1  <  i  <  m+1.  Then,  following 
Dudani,*  we  shall  adopt  the  Bayesian  approach,  wherein  one  writes 

P(Cj|e)  =  p(e|Cj)P(Cj)/p(PJ.  (6-16) 

Here  p(g|C^)  is  the  probability  density  for  the  feature  vector  g,  given  that  it 
belongs  to  class  Cj,  P (C j )  is  the  a  priori  probability  of  occurrence  of  class  Cj, 
and  p(p)  is  the  probability  density  for  occurrence  of  g,  regardless  of 
category.  Obviously, 
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m+1 

p(p)  -  l  p(p|C  .)P(C  •), 

j=l  J  J 


(6-17) 


where  Cj  is  the  class  for  label  Lj. 


For  simplicity,  one  may  assume  that 
p(Cj)  =  l/(m+l)  for  every  j  if  one  does  not  have  any  better  a  apriori 
understanding  of  the  occurrence  of  classes.  In  that  case  (6-16)  becomes 


m+1 


p(Cj |p)  =  p(p|Cj)/  i  p(e|cj). 

j  i 


(6-18) 


In  any  event  the  probabilities  afforded  by  (6-16)  or  (6-18)  would  become  the 
components  of  our  probability  assignment  vector  for  object  aj. 


RELAXATION  METHODOLOGY  FOR  TARGET  CLASSIFICATION 


We  come  now  to  one  of  the  most  important  phases  of  the  classification 
process,  namely,  that  of  unifying  our  knowledge  about  the  individual  components 
of  a  target  in  order  to  understand  the  target  as  a  whole.  This  is  the  purpose  of 
the  so-called  relaxation  methodology  which  we  now  introduce.  The  theory  has  been 
developed  systematically  in  papers  by  Rosenfeld,  Hummel,  and  Zucker,  ^ 


03 

Pel  eg  and  Rosenfeld,  Zucker,  Krishnamurthy ,  and  Haar,  Davis  and 


pk  k 

Rosenfeld,  and  Eklundh,  Yamamoto,  and  Rosenfeld.  Now  there  are  two  basic 
types  of  probabilistic  relaxation  schemes,  which  we  may  call  linear  and 
nonlinear,  respectively.  Both  are  studied  in  the  paper  by  Rosenfeld,  Hummel,  and 
Zucker  just  quoted.  Therefore,  we  shall  not  enter  into  any  elaborate  details 
with  regard  to  convergence  except  for  mentioning  that  validity  of  the  method  has 
been  proved  under  certain  standard  conditions  for  the  linear  case,  but  has  not 
been  established,  in  general,  for  the  nonlinear  method.  Nevertheless  experiments 
have  shown  that,  in  practice,  the  nonlinear  procedure  works  well  and  yields  more 
realistic  results  than  does  the  linear  one. 


We  first  present  the  linear  relaxation  model  and  show  how  one  adapts  it  to 
the  target  classification  problem.  The  model  given  by  Rosenfeld,  Hummel  and 
Zucker  is 


Pi(x)  =  l  ciJ[XiPiJ(x|x,)  Pj(x')j,  1  <  i,j  <  n, 
J  A 


(e>-iy) 


with  the  following  notational  clarification: 
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(c^j)  -  doubly  stochastic  coefficient  matrix  of  positive  elements,  i.e.,  one  for 
which  each  row  and  column  sums  to  unity. 

Pij  ( A  j A ' )  -  probability  that  object  a^  has  label  A,  given  that  object  aj  has 
label  A*. 

pi (a)  -  probability  that  object  a^  has  label  A. 

Also,  p1i (A | A 1 )  =  1  if  A  =  A1  and  0  otherwise.  Basically  one  is  searching  for  a 
fixed  point  {p.(A)},  1  <  i  <  n,  for  any  given  label  A.  By  the  Brouwer  fixed 
point  theorem  ,  such  a  fixed  point  always  exists.  The  game  is  to  start  with 
some  initial  probabilistic  labeling  p.(A)  for  each  of  our  objects  ai  and 

J  « 

iteratively  apply  (6-19)  until  convergence  to  a  stochastic  labeling. 

Now  let  us  interpret  (6-19)  in  the  light  of  target  modeling.  First  of  all, 
note  that  the  target  is  viewed  at  some  aspect,  say  Ut,  <{>t,  et),  where  the  angles 
are  the  usual  Euler  angles  of  the  type  we  mentioned  at  the  beginning  of  this 
chapter.  This  induces,  of  course,  an  Euler  representation  (xp^. j j *0t: j  )  for 
every  object  aj.  In  Dudani's  thesis  two  procedures  were  introduced  for 
calculating  moment  invariants,  the  first  of  which  made  use  of  a  principal  axis 
transformation  and  the  second  of  which  produced  so-called  orthogonal  moment 
invariants.  The  second  procedure  would  not,  in  our  case,  be  generally 
applicable,  since  it  requires  that  the  object  be  rotated  about  its  own  center  of 
geometry.  In  our  case,  the  entire  object,  such  as  a  ship,  is  being  rotated  about 
its  center  of  geometry.  Clearly,  for  example,  an  engine  room  is  not  being 
rotated  about  its  own  center  of  geometry.  The  analysis  given  by  equations  (6-11) 
-  (6-14)  still  applies,  but  with  the  axis  of  the  camera  system  being  directed 
toward  the  center  of  geometry  of  the  ship,  missile,  etc.,  involved.  One  word  of 
caution,  however,  is  in  order  here.  As  in  any  mathematical  analysis  of  a 
physical  situation,  one  needs  to  focus  his  attention  on  the  interrelationships 
that  exist  and  to  be  certain  that  the  mathematical  theory  adequately  reflects  the 
real  world  conditions.  If,  for  example,  one  is  attempting  to  classify  a  ship  or 
an  airplane,  the  environment  to  which  it  is  exposed  may  have  a  significant  impact 
on  results  obtained  through  infrared,  as  opposed  to  visual,  processing.  There 
may  be  a  substantial  difference  in  results  obtained  during  the  day  as  opposed  to 
those  gathered  at  night.  In  the  former  case,  one  would  normally  have  to  consider 
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the  yaw,  roll,  and  pitch  of  the  target,  since  each  set  of  angular  quantities 
would  correspond  to  an  entirely  different  radiation  pattern.  In  the  latter  case, 
it  is  expected  that  environmental  factors  (such  as  solar  heating)  would  be 
minimized,  leaving  intrinsic  sources  of  energy  to  govern  the  signal  output.  Then 
it  would  be  possible  to  show  that,  if  one  were,  for  example,  to  view  the  target 
broadside,  pitching  motion  would  not  affect  the  gray  level  output.  That  is,  any 
motion  in  a  plane  orthogonal  to  the  viewing  axis  would  be  unimportant  with  regard 
to  target  component  discrimination.  Now  let  us  use  (6-3)  and  (6-4)  with  Y(s) 
replaced  by  Ycg,  the  latter  being  understood  to  be  the  Y  coordinate  of  the  entire 
object's  center  of  geometry.  Let  us  assume  that  the  ship  or  missile  is  far 
enough  away  so  that  each  point  of  it  can  be  considered  to  be  at  distance  Ygg  from 
the  camera.  Next  let  vpp  be  the  pq  central  moment  of  the  gray  levels  for  the 
original  (X,Z)  coordinates  and  ypp  the  corresponding  central  moments  for  the 
image  in  (u,v)  coordinates,  which  are  parallel  to  the  camera  coordinates  (as 
shown  in  Figure  6-1).  If  there  are  N  pixels  in  the  component  of  interest,  one 
has 


PQ 


I 


gITR  N  pixels 


9IR(i»j)  (ui 


u)P(Vj  -  v)q. 


(6-20) 


where 


u  =  I  9 

N  pixels 

v  *  l  9 

N  pixels 


IR 

IR 


(i » j )ui /gITR 


(6-21) 


Here  9ir ( "» , J )  is  the  image  red  signature  intensity  at  pixel  (i,j),  and 

9ITR  =  E  g,R(i,j).  Let  (xjj»  z-jj)  be  the  x  and  z  coordinates  of  an 

N  pixels  1  J 

object  point  which  is  projected  to  the  image  point  (uj,Vj).  Also,  let 


%  =  l  9 

n  pixels 

l  -  l  9 


n  pixels 


tR( 

i ,  j  )x-j 

j/gtTR 

tR( 

i  o)Zi 

j/9tTR’ 

(6-22) 
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where  gtR(i,j)  is  the  target  red  signature  intensity  for  surface  pixel  (i,j)  and 

9tTR  *  I  .  ,  9tR(t.0)-  Now  u(  =  dXfj>  u  =  dX,  »  -  dZ^,  and  v  =  dZ, 

N  pixels 

where  d  =  ( ycg  +  f).  Furthermore,  assume  that  there  exists  a  red  attenuation 

factor  aR(d),  solely  a  function  of  the  distance  traversed  from  object  to  camera 
(or  detector)  by  the  red  component  output.  Then,  for  every  (i,j). 


g j r ( i » j )  =  3R (d  )g^R  ( i ,  j  ) . 

Substituting  (6-23)  into  (6-2U),  we  find  that 


where 


(6-23) 


(6-24) 


-Z,q' 


(6-25) 


Note  that  (6-24)  does  not  involve  the  attenuation  factor  aR(d).  Also,  we  are 
assuming,  in  our  analysis,  that  radiation  propagates  along  straight-line  paths, 
i.e.,  that  any  refraction  phenomena  are  of  secondary  significance.  Now,  using 
the  fact,  from  (6-24),sthat  uZQ  =  d2v2Q  and  eliminating  d2,  one  sees  that 

»  /wJp+q>'2  is  a  size  invariant.  That  is  to  say,  if  u  n(1)  is  the  result  of 

pq  ^  (21  pq 

using  dj  in  (6-24)  and  y  v  '  is  the  result  of  using  d2,  one  has 


u  (1)/(u  U))(P+q)/2  ,  (2)/(  (2)}(p+q)/2  _  Furthermore,  we  would  like  to 

pq  20  pq  2u 

use  for  u  certain  fiducial  quantities,  namely,  central  moments  with  respect  to 

Pq 

principal  axes.  Assuming  that  our  origin  in  the  image  plane  is  at  the  red 
intensity  center  of  geometry  of  the  component  image  and  employing  the  rotation  of 
coordinates 


u*  =  u  cos  e  -  v  sin  e 

v'  =  u  sin  9  +  v  cos  9,  (b-26) 

as  Dudani  does,  one  finds  that 


2p^i  =  (u2q  ”  Wq2 )  sin  29  +  2p 1 1 c o s  29. 


(6-27) 
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By  definition,  the  principal  axes  are  those  for  which  =  0.  This  leads  to 
the  condition 

tan  20  =  -  2y^/(y ^  -  y^).  (6-28) 

The  angle  e,  as  given  by  (6-28),  is  the  orientation  of  the  principal  axis  system 

relative  to  our  uv  coordinate  frame  (parallel  to  the  camera  frame).  Our 

invariants  to  size  and  rotation  then  become  M‘  =  y  '/(uon'l^^^i  provided 

pq  pq  tU  r 

that  (p,q)  *  (2,0).  Note  that  may  be  computed  directly  without  any  knowledge 
of  distance  from  object  to  image.  There  is  one  more  invariant,  namely, 

U2o'/d  ,  which  requires  a  knowledge  of  d  for  its  determination.  Through  this 
invariant  one  gains  an  understanding  of  target  component  size.  Of  course  one  can 
perform  the  same  sequence  of  computations  for  the  blue  intensities. 

When  the  objects  aj  are  projected  into  the  image  plane  of  a  camera,  they 
will,  of  course,  adopt  certain  positions  in  that  plane  relative  to  one  another. 
The  probabilities  p  -  - ( X  | A  ' )  in  (6-19)  represent  compatabilities  that  must  exist 

'  J 

among  such  objects  and,  as  such,  should  be  prespecified.  Suppose  now  that  one  is 

addressing  k  target  types,  say  {T^},  1  <  i  <  k .  Suppose,  also, 

that,  for  each  target  type  T.j,  we  build  a  training  set  for  a  number  of  target 

pi  pi  pi 

aspects  (et  ,  <f>t  ,  ),  1  <  r  <  p,  there  being  p  overall  views  considered. 

Such  training  sets,  inasmuch  as  they  might  be  employed  to  derive  distance 
relationships  among  components,  could  be  obtained  under  idealized  conditions. 

For  example,  the  target  itself,  say  a  ship,  could  be  studied  as  a  model  in  a 
laboratory  setting  if  one's  only  interest  is  to  procure  the  distance  between  the 
engine  room  and  passenger  quarters  at  any  given  target  aspect.  A  histogram 
analysis  could  then  be  performed  over  all  the  ship  types  T^,  1  <  i  <  k,  and 
numbers  p--(x|x';D)  could  be  derived  over  all  ship  types  and  aspects.  Here 
p.  . (x J X 1  ;D )  means  specifically  the  probability  that  object  a^  has  label  A,  given 
that  object  aj  has  label  a'  and  is  at  distance  D  from  a^.  There  is  an 
alternative  point  of  view  which  could  be  taken  at  this  point  which  should  be 
mentioned.  Note  that,  on  calculating  p(p|C.)  for  input  to  (6-16),  we  have 

J 

identified  Cj  with  the  jth  ship  type,  so  that  Cj  is  really  Tj.  In  effect,  we 
have  regarded  all  out  target  image  feature  vectors,  regardless  of  target 
orientation,  as  coming  from  the  same  Gaussian  distribution.  In  parallel  with 
this  concept,  it  is  reasonable  to  calculate  p.,(a|a';D)  based  on  all  aspects  of 

*  J 
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the  target.  A  better  way  of  approaching  the  task  perhaps  is  that  taken  by 
Dudani .  Instead  of  identifying  Cj  solely  with  Tj,  one  includes  in  Cj  the  target 
aspect  (0 1 ,4> ^ ) ,  so  that  the  basic  Gaussian  density  P (e. [ C j )  has  a  different 
statistical  interpretation.  Feature  vectors  are  constructed  under  varying 
environmental  conditions,  but  for  a  given  target  aspect  and  target  type.  In  the 
spirit  of  Dudani's  approach,  the  prior  probability  P ( C j )  in  (6-16)  is 


p(Cj)  =  p(Tj)p(i|>t|<J>t  )p(*t|*t  )p(et|et  ),  (6-29) 

wherein  P(Tj)  is  the  probability  of  the  target  type  Tj,  p (^ 1 1 ip t  )  is  the  Gaussian 

o 

density  for  yaw  given  a  nominal  value  i|>t  ,  p (<p 1 1 <p t  )  is  the  Gaussian  density 

o  o 

for  roll  <j>t,  given  nominal  ,  and,  finally,  p (e^ [0 1  )  is  the  density  for 

0  0 

pitch  0  ,  given  nominal  pitch  0.  .  Likewise  we  should  now  understand 

zo 

P-jjU|A';D)  a  little  differently.  We  should  again  perform  a  histogram  analysis, 
but  condition  it  on  target  aspect.  One  then  needs  to  integrate  (6-19) 
over  all  target  aspects,  i.e.,  one  considers 


If/ 2  Ztt  2tt 

1,1  l  Pi(xl9t>*t.*t,p(9tl6t  )p(*tK  )p<*tl*t  )d9td*td*t 

— TT  /  c  \j  u  0  0  0 

(6-30) 


n  /  2  2tt  2-jt 

/  J  /  Pi i (x |x * ;D,0t,ip  ,<pt )p. (x * |ot ,^pt ,♦* )p(et |ot  ) 
j  X'  -it/ 2  0  0  1J  t  t  t  j  t  t  t  t' 

•  P(*t|*t  )pUtkt  )d0tdiptd*t] 
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If  the  Gaussian  (or,  possibly,  uniform)  random  variables  6t,  \|»t,  and  <t»t  in  (6-29) 
have  small  variances,  then  we  may  approximate  p1j(x|x';  D ,e t t )  by 
pi  . (x )x 1  ;D,et  ,\|>t  ,4>t  ).  In  that  case  (6-30)  becomes 


ooo 


Pi(x|et  ,*t  ,*t  )=  l  cij[^pij.(x|x,;D,et  ,$t  ) 

0  0  0  J  A 


(6-31) 


ooo 


•  Pj(x  9^.  )] 


ooo 


This  mechanism  is  theoretically  perhaps  more  rigorous,  but,  at  the  same  time,  may 
require  much  more  work  effort  to  implement. 

Either  (6-19)  or  (6-31)  could,  in  principle,  be  used  in  an  iterative  scheme 
for  procuring  consistent  labeling  vectors  (p.(x))  for  the  components  a^.  Having 
thus  identified  the  components,  together  with  their  relative  positions,  one  may, 
at  a  higher  level,  be  able  to  say,  with  some  degree  of  certainty,  what  type  of 
target  is  present.  If  the  target  changes  heading  or  if  we  permit  ourselves  to 
walk  around  the  target  in  order  to  improve  our  understanding  of  it  and  rerun  the 
basic  iterative  scheme,  there  is  a  good  chance  that  we  can  improve  our 
understanding  of  the  category  of  target  involved. 

We  would  also  like  to  say  that  certain  topological  properties  might  be 

07 

useful  in  target  classification.  For  example,  is  the  object  aj  we  are 
considering  convex?  Is  it  connected?  If  it  is  connected,  what  can  be  said  about 
the  number  of  holes  contained  in  the  image?  Such  higher  level  information  could 
certainly  be  used  as  a  logical  input  to  classification  of  components  of  a 
target.  This  prior  knowledge  could  indeed  be  utilized  in  defining  p(Tj)  as  input 
to  (b-29). 

Let  us  now  proceed  with  the  nonlinear  probabilistic  model  suggested  by 

Kosenfeld,  Hummel  and  Zucker.  It  turns  out  that  such  a  model  is  more  useful  than 

the  linear  model  in  two  different  ways:  (1)  It  appears  to  converge,  in  a  large 

number  of  practical  applications,  to  a  limit  not  depending  on  initial  assignment 

probabilities.  (2)  It  exhibits  favorable  compatibility  properties  with  regard 

to  generation  of  p^(x)  consistent  with  correlations  between  objects.  The  model 

is  predicated  on  certain  known  correlations  rather  than  compatibilities.  The 

correlations  are  denoted  by  r. .(x,x‘),  where,  of  course,  |r.(x,x')|  <  1.  One 

( k  )  1 J  [b\  1  J  ' 

defines  a  quantity  q/  '(x)  analogous  to  the  p^ v  '  (\)  (the  kth  iterate 

of  p.(X))  defined  in  the  linear  model: 
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(k) 


(x)  - 


c1JC 


r,  (x,x,)p.{k)(x')]t 

>  >  J  J 


(6-32) 


where  again  the  c^j  are  nonnegative  and  the  matrix  (c^j)  is  doubly  stochastic. 
Then  we  define 

p.(k+1)(A)  =  Pl(k)(x)[l+qi(k)(x)]/  l  pi(k)(x)[l+qi(k)(x)].  (6-33) 

It  turns  out  that  |q^k^(x)|  <  1,  so  that,  from  (6-33),  0  <  p^  ^k+^(x)  <  1. 
Again,  we  can  certainly  invoke  a  model  similar  to  our  linear  version  in  the  case 
where  aspect  distinctions  are  not  made.  However,  when  they  are  made,  as  in  the 
case  (6-26)  just  studied,  it  now  becomes  impossible  to  parallel  the  modeliny  in 
(6-30),  which  goes  through  because  of  the  linearity  of  the  operator.  On  the 
other  hand  if  we  assume  that  we  know  the  values  of  \j>  and  <f> ,  i.e.,  if  <|>  =  <|>0, 

<j>  =  <p  and  e=e,  then  we  need  not  average  as  we  do  in  (6-30),  and  one  merely 
applies  (6-33),  given  the  aspect  {0o,i|>o,<j>o).  The  problem  we  now  pose  is  that  of 
providing  the  inputs  r..(x,x‘).  For  a  given  aspect,  we  suppose  that 

’  J 

r.  .(a, A1)  depends  on  the  distance  D  between  the  image  of  object  a^  and  the  image 
of  object  aj.  Following  Rosenfeld,  Hummel  and  Zucker,  let  A  be  the  event  that 
object  a.j  has  label  A,  and  let  B  be  the  event  that  aj  has  label  A1.  Then 


r^U.A'jD) 


p  (A  and  B ; D )  -  p(A)p(B) _ 

[ ( P ( A )  -  p2(A) )  (p(8 )  -  p2(B))]1/2  * 


The  quantity  p(A  and  B;D)  is  the  probability  that  a^  has  label  a  and  aj  has 
label  a'  when  it  is  observed  that  they  are  separated  by  0  units  in  the  image 
plane.  We  can  compute,  as  before,  a  frequency  of  joint  occurrence  of  labels 
A  and  a'  for  objects  a^  and  aj  separated  by  D  units,  where  U  <  0  <  ».  Also  ,  we 
can  calculate  the  frequency  of  occurrence  of  a  given  label  A  over  all  target 
types  and  input  that  quantity  for  p(A).  A  similar  statement  pertains  for 
computing  p(B). 
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OPTIMIZATION  OF  STOCHASTIC  LABELING 

28 

A  scheme  presented  by  Pel  eg  is  of  great  importance  in  deciding  upon  the 

merit  of  a  stochastic  labeling  and,  subsequently,  making  a  specific  labeling 

assignment.  Pel  eg  assumes  that  one  wants  to  maximize  the  product  of  two 

probabilities,  one  of  which  he  calls  the  stochastic  probability  P^(n)  for  a  given 

labeling  a  of  nodes  of  a  network  and  the  other  of  which  P u(a)  he  calls  the  model 

n 

probability  corresponding  to  the  labeling  a.  The  function  to  be  optimized  is 

P(n)  =  P$(n)PM(fl).  (6-34) 

Suppose  that,  after  executing  a  relaxation  procedure  as  previously  described,  we 
obtain  a  stochastic  labeling  for  every  object  aj,  i.e.,  we  have  a  label 
probability  assignment  vector  for  each  aj.  Then  choose  that  label  at  aj  for 
which  the  probability  is  maximum,  and  let  n  =  (aj,a2,...,an),  where  n  is  the 

number  of  objects  and  is  the  label  for  object  a^  yielding  maximum 

probability.  Let  P<.(n)  be  the  product  of  these  n  maximum  probabilities.  Next 
form  the  quantity  PM(n),  which  is  the  a  priori  probability  for  the  given  label 

assignment.  This  can  be  obtained,  at  least  approximately,  using  the 

compatibilities  previously  developed  (by  histogram  techniques).  Now  one  can 
adopt  the  following  strategy:  One  may  choose  not  to  iterate  (6-31)  or  (6-33)  to 
a  fixed  point,  but  instead  monitor  the  output,  stopping  at  that  point  where  (6- 
34)  is  maximum.  Also,  one  may  employ  several  viewer  aspects  in  conjunction 
with  (6-31),  (6-33),  and  (6-34),  again  choosing  that  ft  for  which  (6-34)  is 
maximum.  In  this  way  one  procures  a  specific  labeling  from  which  one  can  better 
identify  the  target. 


6-18 


NSWC  TR  84-54 


CHAPTER  7 

SUMMARY 

The  intent  of  this  report  has  been  to  attempt  a  unification  of  ideas  on 
image  modeling  presented  in  recent  years.  Also,  it  is  felt  that  originality  has 
been  injected  into  the  basic  discrimination  procedures,  so  that  more  meaningful 
directions  may  be  pursued.  We  have  emphasized  the  early  processing  of 
information,  since  that  is  of  paramount  importance.  The  labeling  evaluation 
procedures  given  in  Chapter  6  would  certainly  be  meaningless  unless  one  can 
construct  reasonable  feature  vectors  as  prototypes  of  the  classes.  One  of  the 
main  purposes  of  the  work  would  be  to  make  available  a  more  sophisticated 
methodology  for  target  recognition,  so  that  one  could  ultimately  distinguish 
enemy  from  friendly  targets. 

In  the  future  it  is  hoped  that  this  work  can  be  interfaced  in  some 
meaningful  way  with  data  from  realistic  scenarios.  Hopefully  a  computer  code  can 
be  developed  around  the  ideas  presented  here. 
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